{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 背景描述\n",
    "当一个新用户进来时，系统不知道推荐什么，可以从用户看什么来进行相关性的推荐，比如靠近交通、景区等，又或者是含早餐、有电梯等特殊的，这里怎么基于不同酒店的相似度来进行推荐的。\n",
    "\n",
    "#### 基于酒店的文本描述来推荐相似酒店"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "        <script type=\"text/javascript\">\n",
       "        window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
       "        if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
       "        if (typeof require !== 'undefined') {\n",
       "        require.undef(\"plotly\");\n",
       "        requirejs.config({\n",
       "            paths: {\n",
       "                'plotly': ['https://cdn.plot.ly/plotly-latest.min']\n",
       "            }\n",
       "        });\n",
       "        require(['plotly'], function(Plotly) {\n",
       "            window._Plotly = Plotly;\n",
       "        });\n",
       "        }\n",
       "        </script>\n",
       "        "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from nltk.corpus import stopwords\n",
    "from sklearn.metrics.pairwise import linear_kernel\n",
    "from sklearn.feature_extraction.text import CountVectorizer\n",
    "from sklearn.feature_extraction.text import TfidfVectorizer\n",
    "import re\n",
    "import random\n",
    "import cufflinks  # pip install cufflinks\n",
    "import matplotlib.pyplot as plt\n",
    "from plotly.offline import iplot\n",
    "cufflinks.go_offline()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "      <th>desc</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Hilton Garden Seattle Downtown</td>\n",
       "      <td>1821 Boren Avenue, Seattle Washington 98101 USA</td>\n",
       "      <td>Located on the southern tip of Lake Union, the...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sheraton Grand Seattle</td>\n",
       "      <td>1400 6th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Located in the city's vibrant core, the Sherat...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Crowne Plaza Seattle Downtown</td>\n",
       "      <td>1113 6th Ave, Seattle, WA 98101</td>\n",
       "      <td>Located in the heart of downtown Seattle, the ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Kimpton Hotel Monaco Seattle</td>\n",
       "      <td>1101 4th Ave, Seattle, WA98101</td>\n",
       "      <td>What?s near our hotel downtown Seattle locatio...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>The Westin Seattle</td>\n",
       "      <td>1900 5th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Situated amid incredible shopping and iconic a...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                             name  \\\n",
       "0  Hilton Garden Seattle Downtown   \n",
       "1          Sheraton Grand Seattle   \n",
       "2   Crowne Plaza Seattle Downtown   \n",
       "3   Kimpton Hotel Monaco Seattle    \n",
       "4              The Westin Seattle   \n",
       "\n",
       "                                           address  \\\n",
       "0  1821 Boren Avenue, Seattle Washington 98101 USA   \n",
       "1   1400 6th Avenue, Seattle, Washington 98101 USA   \n",
       "2                  1113 6th Ave, Seattle, WA 98101   \n",
       "3                   1101 4th Ave, Seattle, WA98101   \n",
       "4   1900 5th Avenue, Seattle, Washington 98101 USA   \n",
       "\n",
       "                                                desc  \n",
       "0  Located on the southern tip of Lake Union, the...  \n",
       "1  Located in the city's vibrant core, the Sherat...  \n",
       "2  Located in the heart of downtown Seattle, the ...  \n",
       "3  What?s near our hotel downtown Seattle locatio...  \n",
       "4  Situated amid incredible shopping and iconic a...  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(\"data/Seattle_Hotels.csv\", encoding=\"latin-1\")  # 西雅图酒店推荐数据\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面分别是酒店名字、地址及描述"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(152, 3)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Located on the southern tip of Lake Union, the Hilton Garden Inn Seattle Downtown hotel is perfectly located for business and leisure. \\nThe neighborhood is home to numerous major international companies including Amazon, Google and the Bill & Melinda Gates Foundation. A wealth of eclectic restaurants and bars make this area of Seattle one of the most sought out by locals and visitors. Our proximity to Lake Union allows visitors to take in some of the Pacific Northwest's majestic scenery and enjoy outdoor activities like kayaking and sailing. over 2,000 sq. ft. of versatile space and a complimentary business center. State-of-the-art A/V technology and our helpful staff will guarantee your conference, cocktail reception or wedding is a success. Refresh in the sparkling saltwater pool, or energize with the latest equipment in the 24-hour fitness center. Tastefully decorated and flooded with natural light, our guest rooms and suites offer everything you need to relax and stay productive. Unwind in the bar, and enjoy American cuisine for breakfast, lunch and dinner in our restaurant. The 24-hour Pavilion Pantry? stocks a variety of snacks, drinks and sundries.\""
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['desc'][0]  # 查看酒店描述的个例"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 文本词频统计\n",
    "统计下酒店介绍文本里大多数描述的信息有哪些"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "vec = CountVectorizer().fit(df['desc'])  # 寄存器\n",
    "bag_of_words = vec.transform(df['desc'])  # 将文本转数值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       ...,\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 1, 0, 0]], dtype=int64)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bag_of_words.toarray()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(152, 3200)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bag_of_words.shape  # 一共152含对应上面的数据，其中有3200个不同的词"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[ 1, 11, 11, ...,  2,  6,  2]], dtype=int64)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum_words = bag_of_words.sum(axis=0)  # 计算每个词重复的次数\n",
    "sum_words"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('located', 108),\n",
       " ('on', 129),\n",
       " ('the', 1258),\n",
       " ('southern', 1),\n",
       " ('tip', 1),\n",
       " ('of', 536),\n",
       " ('lake', 41),\n",
       " ('union', 33),\n",
       " ('hilton', 12),\n",
       " ('garden', 11),\n",
       " ('inn', 89),\n",
       " ('seattle', 533),\n",
       " ('downtown', 133),\n",
       " ('hotel', 295),\n",
       " ('is', 271),\n",
       " ('perfectly', 6),\n",
       " ('for', 216),\n",
       " ('business', 87),\n",
       " ('and', 1062),\n",
       " ('leisure', 18),\n",
       " ('neighborhood', 35),\n",
       " ('home', 57),\n",
       " ('to', 471),\n",
       " ('numerous', 1),\n",
       " ('major', 12),\n",
       " ('international', 32),\n",
       " ('companies', 6),\n",
       " ('including', 47),\n",
       " ('amazon', 19),\n",
       " ('google', 6),\n",
       " ('bill', 4),\n",
       " ('melinda', 4),\n",
       " ('gates', 5),\n",
       " ('foundation', 4),\n",
       " ('wealth', 1),\n",
       " ('eclectic', 8),\n",
       " ('restaurants', 35),\n",
       " ('bars', 7),\n",
       " ('make', 43),\n",
       " ('this', 63),\n",
       " ('area', 51),\n",
       " ('one', 75),\n",
       " ('most', 40),\n",
       " ('sought', 1),\n",
       " ('out', 23),\n",
       " ('by', 71),\n",
       " ('locals', 5),\n",
       " ('visitors', 4),\n",
       " ('our', 359),\n",
       " ('proximity', 8),\n",
       " ('allows', 3),\n",
       " ('take', 31),\n",
       " ('in', 449),\n",
       " ('some', 22),\n",
       " ('pacific', 42),\n",
       " ('northwest', 42),\n",
       " ('majestic', 4),\n",
       " ('scenery', 2),\n",
       " ('enjoy', 93),\n",
       " ('outdoor', 23),\n",
       " ('activities', 8),\n",
       " ('like', 46),\n",
       " ('kayaking', 3),\n",
       " ('sailing', 1),\n",
       " ('over', 14),\n",
       " ('000', 11),\n",
       " ('sq', 4),\n",
       " ('ft', 4),\n",
       " ('versatile', 3),\n",
       " ('space', 97),\n",
       " ('complimentary', 62),\n",
       " ('center', 151),\n",
       " ('state', 30),\n",
       " ('art', 44),\n",
       " ('technology', 4),\n",
       " ('helpful', 2),\n",
       " ('staff', 9),\n",
       " ('will', 46),\n",
       " ('guarantee', 3),\n",
       " ('your', 186),\n",
       " ('conference', 6),\n",
       " ('cocktail', 6),\n",
       " ('reception', 7),\n",
       " ('or', 161),\n",
       " ('wedding', 4),\n",
       " ('success', 4),\n",
       " ('refresh', 4),\n",
       " ('sparkling', 2),\n",
       " ('saltwater', 1),\n",
       " ('pool', 37),\n",
       " ('energize', 2),\n",
       " ('with', 280),\n",
       " ('latest', 4),\n",
       " ('equipment', 3),\n",
       " ('24', 42),\n",
       " ('hour', 32),\n",
       " ('fitness', 42),\n",
       " ('tastefully', 4),\n",
       " ('decorated', 4),\n",
       " ('flooded', 1),\n",
       " ('natural', 8),\n",
       " ('light', 26),\n",
       " ('guest', 57),\n",
       " ('rooms', 106),\n",
       " ('suites', 67),\n",
       " ('offer', 59),\n",
       " ('everything', 18),\n",
       " ('you', 304),\n",
       " ('need', 25),\n",
       " ('relax', 25),\n",
       " ('stay', 105),\n",
       " ('productive', 4),\n",
       " ('unwind', 11),\n",
       " ('bar', 34),\n",
       " ('american', 5),\n",
       " ('cuisine', 11),\n",
       " ('breakfast', 68),\n",
       " ('lunch', 4),\n",
       " ('dinner', 7),\n",
       " ('restaurant', 32),\n",
       " ('pavilion', 1),\n",
       " ('pantry', 2),\n",
       " ('stocks', 1),\n",
       " ('variety', 12),\n",
       " ('snacks', 9),\n",
       " ('drinks', 6),\n",
       " ('sundries', 2),\n",
       " ('city', 79),\n",
       " ('vibrant', 14),\n",
       " ('core', 5),\n",
       " ('sheraton', 8),\n",
       " ('grand', 13),\n",
       " ('provides', 9),\n",
       " ('gateway', 4),\n",
       " ('diverse', 5),\n",
       " ('sights', 2),\n",
       " ('sounds', 2),\n",
       " ('step', 11),\n",
       " ('front', 11),\n",
       " ('doors', 8),\n",
       " ('find', 31),\n",
       " ('gourmet', 1),\n",
       " ('dining', 36),\n",
       " ('world', 24),\n",
       " ('class', 13),\n",
       " ('shopping', 31),\n",
       " ('exciting', 7),\n",
       " ('entertainment', 11),\n",
       " ('iconic', 15),\n",
       " ('local', 45),\n",
       " ('attractions', 59),\n",
       " ('pike', 90),\n",
       " ('place', 102),\n",
       " ('market', 97),\n",
       " ('needle', 68),\n",
       " ('chihuly', 3),\n",
       " ('glass', 10),\n",
       " ('museum', 43),\n",
       " ('as', 117),\n",
       " ('only', 34),\n",
       " ('seven', 6),\n",
       " ('hotels', 28),\n",
       " ('north', 14),\n",
       " ('america', 5),\n",
       " ('earn', 3),\n",
       " ('esteemed', 1),\n",
       " ('designation', 1),\n",
       " ('guests', 54),\n",
       " ('can', 55),\n",
       " ('book', 11),\n",
       " ('confidently', 1),\n",
       " ('knowing', 1),\n",
       " ('they', 11),\n",
       " ('re', 64),\n",
       " ('receiving', 1),\n",
       " ('highest', 2),\n",
       " ('benchmark', 1),\n",
       " ('product', 1),\n",
       " ('service', 53),\n",
       " ('offerings', 2),\n",
       " ('available', 36),\n",
       " ('experience', 52),\n",
       " ('recently', 7),\n",
       " ('completed', 1),\n",
       " ('multimillion', 1),\n",
       " ('dollar', 1),\n",
       " ('transformation', 1),\n",
       " ('featuring', 26),\n",
       " ('all', 100),\n",
       " ('new', 17),\n",
       " ('an', 91),\n",
       " ('expanded', 5),\n",
       " ('club', 17),\n",
       " ('lounge', 20),\n",
       " ('modern', 34),\n",
       " ('meeting', 27),\n",
       " ('event', 29),\n",
       " ('spaces', 11),\n",
       " ('gather', 1),\n",
       " ('stylish', 17),\n",
       " ('lobby', 26),\n",
       " ('private', 29),\n",
       " ('collection', 6),\n",
       " ('artists', 2),\n",
       " ('while', 34),\n",
       " ('enjoying', 5),\n",
       " ('favorite', 7),\n",
       " ('beverage', 5),\n",
       " ('from', 224),\n",
       " ('starbucks', 12),\n",
       " ('features', 27),\n",
       " ('several', 5),\n",
       " ('options', 14),\n",
       " ('loulay', 1),\n",
       " ('kitchen', 17),\n",
       " ('james', 1),\n",
       " ('beard', 1),\n",
       " ('award', 13),\n",
       " ('winning', 11),\n",
       " ('chef', 3),\n",
       " ('thierry', 1),\n",
       " ('rautureau', 1),\n",
       " ('heart', 35),\n",
       " ('crowne', 5),\n",
       " ('plaza', 4),\n",
       " ('offers', 43),\n",
       " ('exceptional', 5),\n",
       " ('blend', 3),\n",
       " ('style', 27),\n",
       " ('comfort', 24),\n",
       " ('ll', 48),\n",
       " ('notice', 1),\n",
       " ('cool', 5),\n",
       " ('comfortable', 30),\n",
       " ('unconventional', 1),\n",
       " ('touches', 4),\n",
       " ('that', 65),\n",
       " ('set', 4),\n",
       " ('us', 21),\n",
       " ('apart', 1),\n",
       " ('soon', 3),\n",
       " ('inside', 7),\n",
       " ('marvel', 2),\n",
       " ('at', 231),\n",
       " ('stunning', 7),\n",
       " ('views', 39),\n",
       " ('lights', 4),\n",
       " ('relaxing', 12),\n",
       " ('sleep', 6),\n",
       " ('advantage', 12),\n",
       " ('beds', 17),\n",
       " ('wireless', 9),\n",
       " ('internet', 26),\n",
       " ('throughout', 7),\n",
       " ('amenities', 60),\n",
       " ('help', 11),\n",
       " ('temple', 1),\n",
       " ('spa', 13),\n",
       " ('tight', 1),\n",
       " ('amenity', 1),\n",
       " ('kits', 1),\n",
       " ('lavender', 1),\n",
       " ('spray', 1),\n",
       " ('lotions', 1),\n",
       " ('rejuvenate', 1),\n",
       " ('invigorating', 2),\n",
       " ('workout', 7),\n",
       " ('get', 11),\n",
       " ('suggestions', 1),\n",
       " ('expert', 2),\n",
       " ('concierge', 3),\n",
       " ('savor', 6),\n",
       " ('sumptuous', 1),\n",
       " ('regatta', 1),\n",
       " ('grille', 1),\n",
       " ('where', 27),\n",
       " ('happy', 6),\n",
       " ('daily', 10),\n",
       " ('4pm', 1),\n",
       " ('7pm', 1),\n",
       " ('monthly', 1),\n",
       " ('drink', 5),\n",
       " ('specials', 2),\n",
       " ('come', 17),\n",
       " ('emerald', 17),\n",
       " ('has', 41),\n",
       " ('what', 11),\n",
       " ('near', 48),\n",
       " ('location', 33),\n",
       " ('better', 5),\n",
       " ('question', 2),\n",
       " ('might', 5),\n",
       " ('be', 43),\n",
       " ('not', 20),\n",
       " ('nearby', 16),\n",
       " ('addition', 4),\n",
       " ('being', 1),\n",
       " ('here', 19),\n",
       " ('just', 82),\n",
       " ('small', 8),\n",
       " ('sampling', 2),\n",
       " ('rest', 4),\n",
       " ('columbia', 2),\n",
       " ('whose', 1),\n",
       " ('sky', 2),\n",
       " ('view', 11),\n",
       " ('observatory', 1),\n",
       " ('73rd', 1),\n",
       " ('floor', 11),\n",
       " ('tallest', 1),\n",
       " ('public', 6),\n",
       " ('viewing', 1),\n",
       " ('west', 15),\n",
       " ('mississippi', 1),\n",
       " ('historic', 24),\n",
       " ('5th', 4),\n",
       " ('avenue', 15),\n",
       " ('theatre', 2),\n",
       " ('musical', 1),\n",
       " ('productions', 1),\n",
       " ('central', 12),\n",
       " ('library', 7),\n",
       " ('architectural', 2),\n",
       " ('within', 36),\n",
       " ('half', 5),\n",
       " ('mile', 14),\n",
       " ('must', 5),\n",
       " ('see', 15),\n",
       " ('which', 14),\n",
       " ('houses', 5),\n",
       " ('original', 11),\n",
       " ('pioneer', 15),\n",
       " ('square', 28),\n",
       " ('fantastic', 3),\n",
       " ('flagship', 1),\n",
       " ('nordstrom', 6),\n",
       " ('rack', 1),\n",
       " ('macy', 1),\n",
       " ('sportswear', 1),\n",
       " ('louis', 1),\n",
       " ('vuitton', 1),\n",
       " ('arcteryx', 1),\n",
       " ('oodles', 1),\n",
       " ('independent', 2),\n",
       " ('boutiques', 5),\n",
       " ('great', 39),\n",
       " ('wheel', 2),\n",
       " ('washington', 67),\n",
       " ('convention', 24),\n",
       " ('about', 11),\n",
       " ('bell', 2),\n",
       " ('street', 26),\n",
       " ('pier', 5),\n",
       " ('cruise', 11),\n",
       " ('terminal', 6),\n",
       " ('66', 1),\n",
       " ('sports', 13),\n",
       " ('stadiums', 5),\n",
       " ('centurylink', 17),\n",
       " ('field', 34),\n",
       " ('safeco', 20),\n",
       " ('seahawks', 10),\n",
       " ('mariners', 9),\n",
       " ('sounders', 4),\n",
       " ('situated', 14),\n",
       " ('amid', 3),\n",
       " ('incredible', 3),\n",
       " ('westin', 1),\n",
       " ('contemporary', 12),\n",
       " ('haven', 3),\n",
       " ('prime', 3),\n",
       " ('recharge', 2),\n",
       " ('accommodations', 15),\n",
       " ('comforts', 14),\n",
       " ('signature', 14),\n",
       " ('heavenly', 2),\n",
       " ('gorgeous', 7),\n",
       " ('skyline', 8),\n",
       " ('puget', 16),\n",
       " ('sound', 21),\n",
       " ('cascade', 1),\n",
       " ('mountain', 5),\n",
       " ('range', 9),\n",
       " ('newly', 5),\n",
       " ('renovated', 10),\n",
       " ('1900', 1),\n",
       " ('fifth', 2),\n",
       " ('offering', 21),\n",
       " ('carefully', 2),\n",
       " ('curated', 4),\n",
       " ('wine', 10),\n",
       " ('crafted', 3),\n",
       " ('explore', 20),\n",
       " ('spectacular', 6),\n",
       " ('celebrated', 3),\n",
       " ('waterfront', 38),\n",
       " ('host', 9),\n",
       " ('unforgettable', 3),\n",
       " ('meetings', 16),\n",
       " ('social', 11),\n",
       " ('engagements', 1),\n",
       " ('more', 29),\n",
       " ('than', 19),\n",
       " ('70', 2),\n",
       " ('feet', 16),\n",
       " ('enhanced', 1),\n",
       " ('planning', 8),\n",
       " ('custom', 6),\n",
       " ('catering', 8),\n",
       " ('mind', 8),\n",
       " ('body', 1),\n",
       " ('sleek', 5),\n",
       " ('westinworkout', 1),\n",
       " ('studio', 11),\n",
       " ('designed', 21),\n",
       " ('reflect', 3),\n",
       " ('substance', 1),\n",
       " ('welcoming', 5),\n",
       " ('best', 50),\n",
       " ('paramount', 4),\n",
       " ('summons', 1),\n",
       " ('feel', 11),\n",
       " ('cozy', 10),\n",
       " ('elegant', 5),\n",
       " ('luxurious', 4),\n",
       " ('residence', 5),\n",
       " ('friendly', 29),\n",
       " ('hosts', 3),\n",
       " ('asian', 2),\n",
       " ('right', 15),\n",
       " ('downstairs', 1),\n",
       " ('fall', 1),\n",
       " ('love', 9),\n",
       " ('simple', 6),\n",
       " ('luxury', 18),\n",
       " ('charm', 10),\n",
       " ('boutique', 8),\n",
       " ('warm', 8),\n",
       " ('inviting', 10),\n",
       " ('wood', 7),\n",
       " ('finishes', 2),\n",
       " ('comfy', 4),\n",
       " ('seating', 7),\n",
       " ('areas', 12),\n",
       " ('fireplace', 6),\n",
       " ('classically', 1),\n",
       " ('appointed', 9),\n",
       " ('dash', 1),\n",
       " ('urban', 15),\n",
       " ('flair', 1),\n",
       " ('puts', 5),\n",
       " ('good', 4),\n",
       " ('company', 3),\n",
       " ('block', 10),\n",
       " ('walking', 21),\n",
       " ('distance', 23),\n",
       " ('cafes', 1),\n",
       " ('there', 15),\n",
       " ('are', 136),\n",
       " ('many', 31),\n",
       " ('reasons', 1),\n",
       " ('annually', 1),\n",
       " ('ranked', 1),\n",
       " ('among', 7),\n",
       " ('top', 17),\n",
       " ('five', 12),\n",
       " ('why', 4),\n",
       " ('yours', 2),\n",
       " ('shops', 14),\n",
       " ('sightseeing', 4),\n",
       " ('tour', 5),\n",
       " ('rent', 1),\n",
       " ('car', 7),\n",
       " ('if', 19),\n",
       " ('town', 13),\n",
       " ('walk', 28),\n",
       " ('via', 5),\n",
       " ('underground', 3),\n",
       " ('concourse', 1),\n",
       " ('hungry', 1),\n",
       " ('visit', 21),\n",
       " ('redtrees', 1),\n",
       " ('wide', 6),\n",
       " ('satisfy', 2),\n",
       " ('any', 12),\n",
       " ('foodie', 1),\n",
       " ('destination', 5),\n",
       " ('steps', 20),\n",
       " ('everywhere', 3),\n",
       " ('want', 11),\n",
       " ('motif', 1),\n",
       " ('welcome', 20),\n",
       " ('libation', 1),\n",
       " ('rooftop', 8),\n",
       " ('across', 16),\n",
       " ('touchstones', 1),\n",
       " ('sweeping', 2),\n",
       " ('landscape', 3),\n",
       " ('rich', 8),\n",
       " ('arts', 7),\n",
       " ('music', 16),\n",
       " ('culture', 6),\n",
       " ('infuse', 1),\n",
       " ('surroundings', 1),\n",
       " ('residences', 1),\n",
       " ('hardwoods', 1),\n",
       " ('colors', 2),\n",
       " ('inspired', 12),\n",
       " ('region', 3),\n",
       " ('culinary', 5),\n",
       " ('bounty', 3),\n",
       " ('reflected', 1),\n",
       " ('menus', 1),\n",
       " ('frolik', 1),\n",
       " ('cocktails', 5),\n",
       " ('adjoining', 2),\n",
       " ('patio', 6),\n",
       " ('join', 6),\n",
       " ('between', 5),\n",
       " ('monorail', 5),\n",
       " ('rail', 19),\n",
       " ('airport', 99),\n",
       " ('stroll', 2),\n",
       " ('away', 59),\n",
       " ('known', 8),\n",
       " ('setting', 6),\n",
       " ('trends', 2),\n",
       " ('warwick', 5),\n",
       " ('leading', 4),\n",
       " ('way', 5),\n",
       " ('upbeat', 1),\n",
       " ('belltown', 13),\n",
       " ('district', 27),\n",
       " ('blocks', 21),\n",
       " ('blends', 1),\n",
       " ('classic', 8),\n",
       " ('expected', 1),\n",
       " ('name', 2),\n",
       " ('styling', 1),\n",
       " ('boasting', 1),\n",
       " ('unique', 16),\n",
       " ('staying', 12),\n",
       " ('truly', 3),\n",
       " ('finding', 2),\n",
       " ('pleasant', 3),\n",
       " ('surprises', 1),\n",
       " ('along', 13),\n",
       " ('refreshing', 4),\n",
       " ('seaborne', 1),\n",
       " ('mists', 1),\n",
       " ('breeze', 3),\n",
       " ('evergreen', 2),\n",
       " ('covered', 4),\n",
       " ('hills', 1),\n",
       " ('lining', 1),\n",
       " ('horizon', 2),\n",
       " ('doorstep', 2),\n",
       " ('anything', 2),\n",
       " ('possible', 5),\n",
       " ('surrounded', 4),\n",
       " ('snow', 2),\n",
       " ('capped', 2),\n",
       " ('peaks', 1),\n",
       " ('deep', 3),\n",
       " ('blue', 1),\n",
       " ('waters', 3),\n",
       " ('swaths', 1),\n",
       " ('forests', 1),\n",
       " ('wild', 2),\n",
       " ('it', 55),\n",
       " ('trendy', 4),\n",
       " ('side', 4),\n",
       " ('another', 2),\n",
       " ('elliott', 10),\n",
       " ('bay', 14),\n",
       " ('gleaming', 1),\n",
       " ('wake', 6),\n",
       " ('fresh', 18),\n",
       " ('cup', 6),\n",
       " ('coffee', 38),\n",
       " ('delivered', 1),\n",
       " ('straight', 3),\n",
       " ('room', 77),\n",
       " ('then', 10),\n",
       " ('head', 5),\n",
       " ('neighbourhoods', 1),\n",
       " ('craft', 5),\n",
       " ('breweries', 3),\n",
       " ('spend', 9),\n",
       " ('day', 39),\n",
       " ('hiking', 6),\n",
       " ('up', 44),\n",
       " ('mount', 3),\n",
       " ('rainer', 1),\n",
       " ('nightfall', 1),\n",
       " ('meet', 12),\n",
       " ('goldfinch', 1),\n",
       " ('tavern', 1),\n",
       " ('ethan', 1),\n",
       " ('stowell', 1),\n",
       " ('let', 4),\n",
       " ('chefs', 2),\n",
       " ('show', 4),\n",
       " ('flavours', 1),\n",
       " ('favourite', 1),\n",
       " ('soak', 3),\n",
       " ('scene', 6),\n",
       " ('living', 22),\n",
       " ('mix', 2),\n",
       " ('live', 10),\n",
       " ('dj', 1),\n",
       " ('series', 1),\n",
       " ('before', 3),\n",
       " ('heading', 3),\n",
       " ('memorable', 4),\n",
       " ('trace', 1),\n",
       " ('seasonal', 10),\n",
       " ('fare', 4),\n",
       " ('atmosphere', 7),\n",
       " ('missed', 1),\n",
       " ('work', 25),\n",
       " ('off', 19),\n",
       " ('next', 20),\n",
       " ('morning', 18),\n",
       " ('fit', 7),\n",
       " ('wandering', 1),\n",
       " ('always', 7),\n",
       " ('we', 128),\n",
       " ('ve', 7),\n",
       " ('got', 4),\n",
       " ('during', 9),\n",
       " ('time', 24),\n",
       " ('whatever', 2),\n",
       " ('whenever', 1),\n",
       " ('wish', 2),\n",
       " ('command', 1),\n",
       " ('upscale', 11),\n",
       " ('getaway', 6),\n",
       " ('hyatt', 15),\n",
       " ('theater', 6),\n",
       " ('four', 9),\n",
       " ('diamond', 4),\n",
       " ('landmarks', 6),\n",
       " ('destinations', 3),\n",
       " ('scenic', 4),\n",
       " ('luxuriate', 1),\n",
       " ('opt', 2),\n",
       " ('decadent', 1),\n",
       " ('suite', 25),\n",
       " ('bath', 10),\n",
       " ('upgraded', 2),\n",
       " ('access', 59),\n",
       " ('ever', 3),\n",
       " ('had', 1),\n",
       " ('reading', 3),\n",
       " ('couldn', 3),\n",
       " ('put', 5),\n",
       " ('down', 6),\n",
       " ('but', 12),\n",
       " ('never', 4),\n",
       " ('wanted', 2),\n",
       " ('end', 5),\n",
       " ('know', 8),\n",
       " ('kimpton', 5),\n",
       " ('alexis', 3),\n",
       " ('1901', 1),\n",
       " ('building', 19),\n",
       " ('close', 34),\n",
       " ('enough', 5),\n",
       " ('smell', 1),\n",
       " ('sea', 23),\n",
       " ('air', 11),\n",
       " ('plot', 1),\n",
       " ('peaceful', 2),\n",
       " ('sanctuary', 2),\n",
       " ('den', 1),\n",
       " ('mixed', 1),\n",
       " ('matched', 3),\n",
       " ('characters', 1),\n",
       " ('attentive', 3),\n",
       " ('members', 1),\n",
       " ('who', 5),\n",
       " ('seem', 1),\n",
       " ('plus', 15),\n",
       " ('fellow', 2),\n",
       " ('interesting', 3),\n",
       " ('stories', 2),\n",
       " ('tell', 4),\n",
       " ('ending', 1),\n",
       " ('without', 4),\n",
       " ('giving', 3),\n",
       " ('easy', 44),\n",
       " ('perennial', 1),\n",
       " ('seller', 1),\n",
       " ('positioned', 1),\n",
       " ('edge', 3),\n",
       " ('borders', 1),\n",
       " ('retail', 4),\n",
       " ('guestrooms', 15),\n",
       " ('turntables', 1),\n",
       " ('vinyl', 1),\n",
       " ('max', 1),\n",
       " ('dedicated', 5),\n",
       " ('lovers', 2),\n",
       " ('indulge', 3),\n",
       " ('provenance', 1),\n",
       " ('locally', 7),\n",
       " ('influenced', 1),\n",
       " ('honor', 1),\n",
       " ('beer', 5),\n",
       " ('miller', 1),\n",
       " ('guild', 1),\n",
       " ('fell', 1),\n",
       " ('former', 1),\n",
       " ('maritime', 1),\n",
       " ('workers', 1),\n",
       " ('started', 1),\n",
       " ('first', 19),\n",
       " ('1999', 1),\n",
       " ('roots', 1),\n",
       " ('unfussy', 1),\n",
       " ('intentional', 1),\n",
       " ('design', 5),\n",
       " ('ethos', 1),\n",
       " ('drive', 12),\n",
       " ('loft', 3),\n",
       " ('ceilings', 3),\n",
       " ('hardwood', 1),\n",
       " ('floors', 7),\n",
       " ('wherever', 1),\n",
       " ('could', 5),\n",
       " ('preserve', 1),\n",
       " ('them', 3),\n",
       " ('friends', 11),\n",
       " ('kaws', 1),\n",
       " ('shepard', 1),\n",
       " ('fairey', 1),\n",
       " ('were', 2),\n",
       " ('elements', 3),\n",
       " ('hotelier', 1),\n",
       " ('map', 1),\n",
       " ('still', 3),\n",
       " ('touch', 1),\n",
       " ('point', 1),\n",
       " ('ace', 1),\n",
       " ('today', 16),\n",
       " ('when', 38),\n",
       " ('marriott', 5),\n",
       " ('reveals', 1),\n",
       " ('mountains', 9),\n",
       " ('famous', 15),\n",
       " ('elevator', 2),\n",
       " ('ride', 9),\n",
       " ('sit', 3),\n",
       " ('adjacent', 4),\n",
       " ('harbor', 4),\n",
       " ('also', 62),\n",
       " ('port', 4),\n",
       " ('aquarium', 10),\n",
       " ('westlake', 5),\n",
       " ('olympic', 13),\n",
       " ('sculpture', 4),\n",
       " ('park', 35),\n",
       " ('outfitted', 3),\n",
       " ('plush', 13),\n",
       " ('bedding', 9),\n",
       " ('mini', 7),\n",
       " ('refrigerators', 5),\n",
       " ('large', 11),\n",
       " ('desks', 5),\n",
       " ('wi', 37),\n",
       " ('fi', 37),\n",
       " ('balconies', 3),\n",
       " ('junior', 1),\n",
       " ('provide', 12),\n",
       " ('perfect', 30),\n",
       " ('extended', 15),\n",
       " ('stays', 7),\n",
       " ('reimagined', 2),\n",
       " ('its', 21),\n",
       " ('special', 13),\n",
       " ('perks', 3),\n",
       " ('indoor', 16),\n",
       " ('gym', 3),\n",
       " ('delicious', 8),\n",
       " ('gastropub', 1),\n",
       " ('tempting', 1),\n",
       " ('libations', 1),\n",
       " ('found', 5),\n",
       " ('popular', 12),\n",
       " ('look', 4),\n",
       " ('no', 15),\n",
       " ('further', 2),\n",
       " ('10', 11),\n",
       " ('redesigned', 1),\n",
       " ('venues', 8),\n",
       " ('well', 33),\n",
       " ('supported', 1),\n",
       " ('edgewater', 2),\n",
       " ('reported', 1),\n",
       " ('cnbc', 1),\n",
       " ('amazing', 1),\n",
       " ('breathtaking', 4),\n",
       " ('sunset', 3),\n",
       " ('ship', 5),\n",
       " ('terminals', 5),\n",
       " ('sites', 4),\n",
       " ('decide', 1),\n",
       " ('turn', 2),\n",
       " ('tub', 8),\n",
       " ('water', 10),\n",
       " ('67', 1),\n",
       " ('dynamic', 2),\n",
       " ('soul', 1),\n",
       " ('lodging', 6),\n",
       " ('river', 1),\n",
       " ('rock', 3),\n",
       " ('fireplaces', 3),\n",
       " ('wilderness', 2),\n",
       " ('landscapes', 1),\n",
       " ('outside', 13),\n",
       " ('window', 1),\n",
       " ('treat', 1),\n",
       " ('yourself', 8),\n",
       " ('rewarding', 1),\n",
       " ('springhill', 2),\n",
       " ('south', 19),\n",
       " ('goal', 4),\n",
       " ('whether', 23),\n",
       " ('night', 13),\n",
       " ('weekend', 2),\n",
       " ('each', 28),\n",
       " ('spacious', 29),\n",
       " ('separate', 6),\n",
       " ('kitchenette', 3),\n",
       " ('fridge', 9),\n",
       " ('maker', 7),\n",
       " ('microwave', 12),\n",
       " ('convenience', 18),\n",
       " ('every', 29),\n",
       " ('onsite', 8),\n",
       " ('bistro', 3),\n",
       " ('yale', 1),\n",
       " ('innovative', 5),\n",
       " ('additional', 4),\n",
       " ('highlights', 1),\n",
       " ('include', 19),\n",
       " ('swimming', 10),\n",
       " ('so', 12),\n",
       " ('pamper', 2),\n",
       " ('last', 2),\n",
       " ('least', 1),\n",
       " ('makes', 5),\n",
       " ('mall', 11),\n",
       " ('other', 21),\n",
       " ('plan', 6),\n",
       " ('forward', 1),\n",
       " ('seeing', 1),\n",
       " ('premier', 2),\n",
       " ('fairmont', 3),\n",
       " ('captures', 1),\n",
       " ('old', 7),\n",
       " ('elegance', 4),\n",
       " ('italian', 2),\n",
       " ('renaissance', 2),\n",
       " ('built', 14),\n",
       " ('1924', 1),\n",
       " ('legendary', 2),\n",
       " ('architecture', 1),\n",
       " ('acclaimed', 3),\n",
       " ('impeccable', 1),\n",
       " ('corridors', 2),\n",
       " ('full', 27),\n",
       " ('shines', 1),\n",
       " ('named', 4),\n",
       " ('news', 3),\n",
       " ('report', 3),\n",
       " ('2018', 3),\n",
       " ('hub', 6),\n",
       " ('retreat', 6),\n",
       " ('cozily', 2),\n",
       " ('activity', 2),\n",
       " ('streets', 2),\n",
       " ('lined', 2),\n",
       " ('diversified', 1),\n",
       " ('sophisticated', 4),\n",
       " ('chic', 5),\n",
       " ('excursion', 2),\n",
       " ('afternoon', 4),\n",
       " ('gasworks', 1),\n",
       " ('quiet', 11),\n",
       " ('beauty', 5),\n",
       " ('have', 35),\n",
       " ('instant', 2),\n",
       " ('both', 15),\n",
       " ('worlds', 1),\n",
       " ('travel', 25),\n",
       " ('pleasure', 8),\n",
       " ('trips', 1),\n",
       " ('few', 16),\n",
       " ('corporate', 10),\n",
       " ('vacationers', 2),\n",
       " ('museums', 6),\n",
       " ('less', 11),\n",
       " ('landmark', 9),\n",
       " ('distinctly', 1),\n",
       " ('charming', 1),\n",
       " ('unmistakable', 1),\n",
       " ('sprawling', 2),\n",
       " ('system', 5),\n",
       " ('shows', 3),\n",
       " ('pristine', 2),\n",
       " ('outdoors', 3),\n",
       " ('performing', 1),\n",
       " ('cultural', 7),\n",
       " ('thriving', 6),\n",
       " ('visitor', 1),\n",
       " ('metro', 4),\n",
       " ('attracts', 1),\n",
       " ('deal', 1),\n",
       " ('professional', 5),\n",
       " ('travelers', 23),\n",
       " ('booming', 1),\n",
       " ('fortune', 2),\n",
       " ('500', 4),\n",
       " ('costco', 1),\n",
       " ('wholesale', 1),\n",
       " ('microsoft', 10),\n",
       " ('facebook', 5),\n",
       " ('furthermore', 1),\n",
       " ('fans', 3),\n",
       " ('athletic', 3),\n",
       " ('three', 11),\n",
       " ('teams', 2),\n",
       " ('nestled', 4),\n",
       " ('embassy', 2),\n",
       " ('sleeping', 5),\n",
       " ('queen', 28),\n",
       " ('size', 6),\n",
       " ('sofa', 7),\n",
       " ('bed', 41),\n",
       " ('50', 2),\n",
       " ('inch', 11),\n",
       " ('hdtv', 7),\n",
       " ('kitchenettes', 6),\n",
       " ('dine', 6),\n",
       " ('institution', 1),\n",
       " ('13', 1),\n",
       " ('coins', 1),\n",
       " ('hand', 6),\n",
       " ('zephyr', 1),\n",
       " ('stop', 12),\n",
       " ('health', 4),\n",
       " ('includes', 8),\n",
       " ('heated', 12),\n",
       " ('hot', 25),\n",
       " ('sun', 2),\n",
       " ('deck', 5),\n",
       " ('begin', 3),\n",
       " ('free', 123),\n",
       " ('made', 5),\n",
       " ('order', 3),\n",
       " ('evening', 8),\n",
       " ('55', 4),\n",
       " ('cheer', 2),\n",
       " ('team', 6),\n",
       " ('football', 3),\n",
       " ('fc', 2),\n",
       " ('baseball', 2),\n",
       " ('game', 7),\n",
       " ('mobile', 4),\n",
       " ('trip', 11),\n",
       " ('around', 14),\n",
       " ('boat', 2),\n",
       " ('wamu', 1),\n",
       " ('pan', 1),\n",
       " ('trust', 2),\n",
       " ('settle', 4),\n",
       " ('into', 13),\n",
       " ('epicentre', 1),\n",
       " ('wondering', 1),\n",
       " ('vintage', 2),\n",
       " ('spot', 2),\n",
       " ('corner', 6),\n",
       " ('spring', 3),\n",
       " ('financial', 2),\n",
       " ('spots', 3),\n",
       " ('benaroya', 2),\n",
       " ('hall', 5),\n",
       " ('trade', 2),\n",
       " ('transportation', 7),\n",
       " ('easily', 7),\n",
       " ('walkable', 3),\n",
       " ('ferry', 3),\n",
       " ('dozens', 2),\n",
       " ('buses', 2),\n",
       " ('driving', 3),\n",
       " ('those', 4),\n",
       " ('coming', 2),\n",
       " ('tac', 15),\n",
       " ...]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "words_freq = [(word, sum_words[0,idx]) for word,idx in vec.vocabulary_.items()]  # 得到词及对应出现的次数\n",
    "words_freq"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('the', 1258),\n",
       " ('and', 1062),\n",
       " ('of', 536),\n",
       " ('seattle', 533),\n",
       " ('to', 471),\n",
       " ('in', 449),\n",
       " ('our', 359),\n",
       " ('you', 304),\n",
       " ('hotel', 295),\n",
       " ('with', 280),\n",
       " ('is', 271),\n",
       " ('at', 231),\n",
       " ('from', 224),\n",
       " ('for', 216),\n",
       " ('your', 186),\n",
       " ('or', 161),\n",
       " ('center', 151),\n",
       " ('are', 136),\n",
       " ('downtown', 133),\n",
       " ('on', 129),\n",
       " ('we', 128),\n",
       " ('free', 123),\n",
       " ('as', 117),\n",
       " ('located', 108),\n",
       " ('rooms', 106),\n",
       " ('stay', 105),\n",
       " ('place', 102),\n",
       " ('all', 100),\n",
       " ('airport', 99),\n",
       " ('space', 97),\n",
       " ('market', 97),\n",
       " ('enjoy', 93),\n",
       " ('an', 91),\n",
       " ('pike', 90),\n",
       " ('inn', 89),\n",
       " ('business', 87),\n",
       " ('just', 82),\n",
       " ('city', 79),\n",
       " ('room', 77),\n",
       " ('one', 75),\n",
       " ('by', 71),\n",
       " ('breakfast', 68),\n",
       " ('needle', 68),\n",
       " ('suites', 67),\n",
       " ('washington', 67),\n",
       " ('that', 65),\n",
       " ('re', 64),\n",
       " ('this', 63),\n",
       " ('complimentary', 62),\n",
       " ('also', 62),\n",
       " ('amenities', 60),\n",
       " ('offer', 59),\n",
       " ('attractions', 59),\n",
       " ('away', 59),\n",
       " ('access', 59),\n",
       " ('home', 57),\n",
       " ('guest', 57),\n",
       " ('can', 55),\n",
       " ('it', 55),\n",
       " ('guests', 54),\n",
       " ('service', 53),\n",
       " ('experience', 52),\n",
       " ('area', 51),\n",
       " ('best', 50),\n",
       " ('ll', 48),\n",
       " ('near', 48),\n",
       " ('including', 47),\n",
       " ('like', 46),\n",
       " ('will', 46),\n",
       " ('local', 45),\n",
       " ('art', 44),\n",
       " ('up', 44),\n",
       " ('easy', 44),\n",
       " ('make', 43),\n",
       " ('museum', 43),\n",
       " ('offers', 43),\n",
       " ('be', 43),\n",
       " ('minutes', 43),\n",
       " ('university', 43),\n",
       " ('pacific', 42),\n",
       " ('northwest', 42),\n",
       " ('24', 42),\n",
       " ('fitness', 42),\n",
       " ('lake', 41),\n",
       " ('has', 41),\n",
       " ('bed', 41),\n",
       " ('most', 40),\n",
       " ('views', 39),\n",
       " ('great', 39),\n",
       " ('day', 39),\n",
       " ('waterfront', 38),\n",
       " ('coffee', 38),\n",
       " ('when', 38),\n",
       " ('pool', 37),\n",
       " ('wi', 37),\n",
       " ('fi', 37),\n",
       " ('dining', 36),\n",
       " ('available', 36),\n",
       " ('within', 36),\n",
       " ('neighborhood', 35),\n",
       " ('restaurants', 35),\n",
       " ('heart', 35),\n",
       " ('park', 35),\n",
       " ('have', 35),\n",
       " ('bar', 34),\n",
       " ('only', 34),\n",
       " ('modern', 34),\n",
       " ('while', 34),\n",
       " ('field', 34),\n",
       " ('close', 34),\n",
       " ('union', 33),\n",
       " ('location', 33),\n",
       " ('well', 33),\n",
       " ('international', 32),\n",
       " ('hour', 32),\n",
       " ('restaurant', 32),\n",
       " ('high', 32),\n",
       " ('parking', 32),\n",
       " ('take', 31),\n",
       " ('find', 31),\n",
       " ('shopping', 31),\n",
       " ('many', 31),\n",
       " ('shuttle', 31),\n",
       " ('state', 30),\n",
       " ('comfortable', 30),\n",
       " ('perfect', 30),\n",
       " ('event', 29),\n",
       " ('private', 29),\n",
       " ('more', 29),\n",
       " ('friendly', 29),\n",
       " ('spacious', 29),\n",
       " ('every', 29),\n",
       " ('hotels', 28),\n",
       " ('square', 28),\n",
       " ('walk', 28),\n",
       " ('each', 28),\n",
       " ('queen', 28),\n",
       " ('meeting', 27),\n",
       " ('features', 27),\n",
       " ('style', 27),\n",
       " ('where', 27),\n",
       " ('district', 27),\n",
       " ('full', 27),\n",
       " ('light', 26),\n",
       " ('featuring', 26),\n",
       " ('lobby', 26),\n",
       " ('internet', 26),\n",
       " ('street', 26),\n",
       " ('two', 26),\n",
       " ('such', 26),\n",
       " ('speed', 26),\n",
       " ('tacoma', 26),\n",
       " ('need', 25),\n",
       " ('relax', 25),\n",
       " ('work', 25),\n",
       " ('suite', 25),\n",
       " ('travel', 25),\n",
       " ('hot', 25),\n",
       " ('wifi', 25),\n",
       " ('family', 25),\n",
       " ('km', 25),\n",
       " ('world', 24),\n",
       " ('comfort', 24),\n",
       " ('historic', 24),\n",
       " ('convention', 24),\n",
       " ('time', 24),\n",
       " ('miles', 24),\n",
       " ('out', 23),\n",
       " ('outdoor', 23),\n",
       " ('distance', 23),\n",
       " ('sea', 23),\n",
       " ('whether', 23),\n",
       " ('travelers', 23),\n",
       " ('some', 22),\n",
       " ('living', 22),\n",
       " ('flat', 22),\n",
       " ('us', 21),\n",
       " ('sound', 21),\n",
       " ('offering', 21),\n",
       " ('designed', 21),\n",
       " ('walking', 21),\n",
       " ('visit', 21),\n",
       " ('blocks', 21),\n",
       " ('its', 21),\n",
       " ('other', 21),\n",
       " ('hill', 21),\n",
       " ('lounge', 20),\n",
       " ('not', 20),\n",
       " ('safeco', 20),\n",
       " ('explore', 20),\n",
       " ('steps', 20),\n",
       " ('welcome', 20),\n",
       " ('next', 20),\n",
       " ('feature', 20),\n",
       " ('site', 20),\n",
       " ('sure', 20),\n",
       " ('amazon', 19),\n",
       " ('here', 19),\n",
       " ('than', 19),\n",
       " ('if', 19),\n",
       " ('rail', 19),\n",
       " ('off', 19),\n",
       " ('building', 19),\n",
       " ('first', 19),\n",
       " ('south', 19),\n",
       " ('include', 19),\n",
       " ('property', 19),\n",
       " ('wa', 19),\n",
       " ('motel', 19),\n",
       " ('leisure', 18),\n",
       " ('everything', 18),\n",
       " ('luxury', 18),\n",
       " ('fresh', 18),\n",
       " ('morning', 18),\n",
       " ('convenience', 18),\n",
       " ('through', 18),\n",
       " ('minute', 18),\n",
       " ('new', 17),\n",
       " ('club', 17),\n",
       " ('stylish', 17),\n",
       " ('kitchen', 17),\n",
       " ('beds', 17),\n",
       " ('come', 17),\n",
       " ('emerald', 17),\n",
       " ('centurylink', 17),\n",
       " ('top', 17),\n",
       " ('short', 17),\n",
       " ('equipped', 17),\n",
       " ('tv', 17),\n",
       " ('food', 17),\n",
       " ('screen', 17),\n",
       " ('nearby', 16),\n",
       " ('puget', 16),\n",
       " ('meetings', 16),\n",
       " ('feet', 16),\n",
       " ('across', 16),\n",
       " ('music', 16),\n",
       " ('unique', 16),\n",
       " ('today', 16),\n",
       " ('indoor', 16),\n",
       " ('few', 16),\n",
       " ('convenient', 16),\n",
       " ('capitol', 16),\n",
       " ('fun', 16),\n",
       " ('traveling', 16),\n",
       " ('flight', 16),\n",
       " ('iconic', 15),\n",
       " ('west', 15),\n",
       " ('avenue', 15),\n",
       " ('see', 15),\n",
       " ('pioneer', 15),\n",
       " ('accommodations', 15),\n",
       " ('right', 15),\n",
       " ('urban', 15),\n",
       " ('there', 15),\n",
       " ('hyatt', 15),\n",
       " ('plus', 15),\n",
       " ('guestrooms', 15),\n",
       " ('famous', 15),\n",
       " ('extended', 15),\n",
       " ('no', 15),\n",
       " ('both', 15),\n",
       " ('tac', 15),\n",
       " ('anne', 15),\n",
       " ('over', 14),\n",
       " ('vibrant', 14),\n",
       " ('north', 14),\n",
       " ('options', 14),\n",
       " ('mile', 14),\n",
       " ('which', 14),\n",
       " ('situated', 14),\n",
       " ('comforts', 14),\n",
       " ('signature', 14),\n",
       " ('shops', 14),\n",
       " ('bay', 14),\n",
       " ('built', 14),\n",
       " ('around', 14),\n",
       " ('start', 14),\n",
       " ('conveniently', 14),\n",
       " ('events', 14),\n",
       " ('continental', 14),\n",
       " ('grand', 13),\n",
       " ('class', 13),\n",
       " ('award', 13),\n",
       " ('spa', 13),\n",
       " ('sports', 13),\n",
       " ('town', 13),\n",
       " ('belltown', 13),\n",
       " ('along', 13),\n",
       " ('olympic', 13),\n",
       " ('plush', 13),\n",
       " ('special', 13),\n",
       " ('outside', 13),\n",
       " ('night', 13),\n",
       " ('into', 13),\n",
       " ('after', 13),\n",
       " ('facilities', 13),\n",
       " ('long', 13),\n",
       " ('pet', 13),\n",
       " ('medical', 13),\n",
       " ('hilton', 12),\n",
       " ('major', 12),\n",
       " ('variety', 12),\n",
       " ('starbucks', 12),\n",
       " ('relaxing', 12),\n",
       " ('advantage', 12),\n",
       " ('central', 12),\n",
       " ('contemporary', 12),\n",
       " ('areas', 12),\n",
       " ('five', 12),\n",
       " ('any', 12),\n",
       " ('inspired', 12),\n",
       " ('staying', 12),\n",
       " ('meet', 12),\n",
       " ('but', 12),\n",
       " ('drive', 12),\n",
       " ('provide', 12),\n",
       " ('popular', 12),\n",
       " ('microwave', 12),\n",
       " ('so', 12),\n",
       " ('stop', 12),\n",
       " ('heated', 12),\n",
       " ('bathroom', 12),\n",
       " ('house', 12),\n",
       " ('laundry', 12),\n",
       " ('beautiful', 12),\n",
       " ('seatac', 12),\n",
       " ('garden', 11),\n",
       " ('000', 11),\n",
       " ('unwind', 11),\n",
       " ('cuisine', 11),\n",
       " ('step', 11),\n",
       " ('front', 11),\n",
       " ('entertainment', 11),\n",
       " ('book', 11),\n",
       " ('they', 11),\n",
       " ('spaces', 11),\n",
       " ('winning', 11),\n",
       " ('help', 11),\n",
       " ('get', 11),\n",
       " ('what', 11),\n",
       " ('view', 11),\n",
       " ('floor', 11),\n",
       " ('original', 11),\n",
       " ('about', 11),\n",
       " ('cruise', 11),\n",
       " ('social', 11),\n",
       " ('studio', 11),\n",
       " ('feel', 11),\n",
       " ('want', 11),\n",
       " ('upscale', 11),\n",
       " ('air', 11),\n",
       " ('friends', 11),\n",
       " ('large', 11),\n",
       " ('10', 11),\n",
       " ('mall', 11),\n",
       " ('quiet', 11),\n",
       " ('less', 11),\n",
       " ('three', 11),\n",
       " ('inch', 11),\n",
       " ('trip', 11),\n",
       " ('link', 11),\n",
       " ('check', 11),\n",
       " ('hospitality', 11),\n",
       " ('back', 11),\n",
       " ('cable', 11),\n",
       " ('was', 11),\n",
       " ('boeing', 11),\n",
       " ('visiting', 11),\n",
       " ('desk', 11),\n",
       " ('glass', 10),\n",
       " ('daily', 10),\n",
       " ('seahawks', 10),\n",
       " ('renovated', 10),\n",
       " ('wine', 10),\n",
       " ('cozy', 10),\n",
       " ('charm', 10),\n",
       " ('inviting', 10),\n",
       " ('block', 10),\n",
       " ('elliott', 10),\n",
       " ('then', 10),\n",
       " ('live', 10),\n",
       " ('seasonal', 10),\n",
       " ('bath', 10),\n",
       " ('aquarium', 10),\n",
       " ('water', 10),\n",
       " ('swimming', 10),\n",
       " ('corporate', 10),\n",
       " ('microsoft', 10),\n",
       " ('fully', 10),\n",
       " ('history', 10),\n",
       " ('ideal', 10),\n",
       " ('furnishings', 10),\n",
       " ('tvs', 10),\n",
       " ('hours', 10),\n",
       " ('campus', 10),\n",
       " ('door', 10),\n",
       " ('hospital', 10),\n",
       " ('mason', 10),\n",
       " ('staff', 9),\n",
       " ('snacks', 9),\n",
       " ('provides', 9),\n",
       " ('wireless', 9),\n",
       " ('mariners', 9),\n",
       " ('range', 9),\n",
       " ('host', 9),\n",
       " ('love', 9),\n",
       " ('appointed', 9),\n",
       " ('spend', 9),\n",
       " ('during', 9),\n",
       " ('four', 9),\n",
       " ('mountains', 9),\n",
       " ('ride', 9),\n",
       " ('bedding', 9),\n",
       " ('fridge', 9),\n",
       " ('landmark', 9),\n",
       " ('20', 9),\n",
       " ('use', 9),\n",
       " ('vacation', 9),\n",
       " ('choice', 9),\n",
       " ('community', 9),\n",
       " ('businesses', 9),\n",
       " ('value', 9),\n",
       " ('bedroom', 9),\n",
       " ('windows', 9),\n",
       " ('days', 9),\n",
       " ('courtyard', 9),\n",
       " ('catch', 9),\n",
       " ('extra', 9),\n",
       " ('red', 9),\n",
       " ('road', 9),\n",
       " ('virginia', 9),\n",
       " ('mansion', 9),\n",
       " ('eclectic', 8),\n",
       " ('proximity', 8),\n",
       " ('activities', 8),\n",
       " ('natural', 8),\n",
       " ('sheraton', 8),\n",
       " ('doors', 8),\n",
       " ('small', 8),\n",
       " ('skyline', 8),\n",
       " ('planning', 8),\n",
       " ('catering', 8),\n",
       " ('mind', 8),\n",
       " ('boutique', 8),\n",
       " ('warm', 8),\n",
       " ('rooftop', 8),\n",
       " ('rich', 8),\n",
       " ('known', 8),\n",
       " ('classic', 8),\n",
       " ('know', 8),\n",
       " ('delicious', 8),\n",
       " ('venues', 8),\n",
       " ('tub', 8),\n",
       " ('yourself', 8),\n",
       " ('onsite', 8),\n",
       " ('pleasure', 8),\n",
       " ('includes', 8),\n",
       " ('evening', 8),\n",
       " ('directly', 8),\n",
       " ('alike', 8),\n",
       " ('cancer', 8),\n",
       " ('traveler', 8),\n",
       " ('stadium', 8),\n",
       " ('even', 8),\n",
       " ('escape', 8),\n",
       " ('king', 8),\n",
       " ('refrigerator', 8),\n",
       " ('premium', 8),\n",
       " ('budget', 8),\n",
       " ('facility', 8),\n",
       " ('renton', 8),\n",
       " ('alfred', 8),\n",
       " ('silver', 8),\n",
       " ('cloud', 8),\n",
       " ('bars', 7),\n",
       " ('reception', 7),\n",
       " ('dinner', 7),\n",
       " ('exciting', 7),\n",
       " ('recently', 7),\n",
       " ('favorite', 7),\n",
       " ('inside', 7),\n",
       " ('stunning', 7),\n",
       " ('throughout', 7),\n",
       " ('workout', 7),\n",
       " ('library', 7),\n",
       " ('gorgeous', 7),\n",
       " ('wood', 7),\n",
       " ('seating', 7),\n",
       " ('among', 7),\n",
       " ('car', 7),\n",
       " ('arts', 7),\n",
       " ('atmosphere', 7),\n",
       " ('fit', 7),\n",
       " ('always', 7),\n",
       " ('ve', 7),\n",
       " ('locally', 7),\n",
       " ('floors', 7),\n",
       " ('mini', 7),\n",
       " ('stays', 7),\n",
       " ('maker', 7),\n",
       " ('old', 7),\n",
       " ('cultural', 7),\n",
       " ('sofa', 7),\n",
       " ('hdtv', 7),\n",
       " ('game', 7),\n",
       " ('transportation', 7),\n",
       " ('easily', 7),\n",
       " ('open', 7),\n",
       " ('executive', 7),\n",
       " ('apartment', 7),\n",
       " ('kitchens', 7),\n",
       " ('been', 7),\n",
       " ('non', 7),\n",
       " ('bathrooms', 7),\n",
       " ('table', 7),\n",
       " ('hostel', 7),\n",
       " ('roof', 7),\n",
       " ('ballard', 7),\n",
       " ('proud', 7),\n",
       " ('station', 7),\n",
       " ('services', 7),\n",
       " ('european', 7),\n",
       " ('southcenter', 7),\n",
       " ('people', 7),\n",
       " ('100', 7),\n",
       " ('needs', 7),\n",
       " ('looking', 7),\n",
       " ('don', 7),\n",
       " ('15', 7),\n",
       " ('shoreline', 7),\n",
       " ('much', 7),\n",
       " ('broadway', 7),\n",
       " ('perfectly', 6),\n",
       " ('companies', 6),\n",
       " ('google', 6),\n",
       " ('conference', 6),\n",
       " ('cocktail', 6),\n",
       " ('drinks', 6),\n",
       " ('seven', 6),\n",
       " ('collection', 6),\n",
       " ('sleep', 6),\n",
       " ('savor', 6),\n",
       " ('happy', 6),\n",
       " ('public', 6),\n",
       " ('nordstrom', 6),\n",
       " ('terminal', 6),\n",
       " ('spectacular', 6),\n",
       " ('custom', 6),\n",
       " ('simple', 6),\n",
       " ('fireplace', 6),\n",
       " ('wide', 6),\n",
       " ('culture', 6),\n",
       " ('patio', 6),\n",
       " ('join', 6),\n",
       " ('setting', 6),\n",
       " ('wake', 6),\n",
       " ('cup', 6),\n",
       " ('hiking', 6),\n",
       " ('scene', 6),\n",
       " ('getaway', 6),\n",
       " ('theater', 6),\n",
       " ('landmarks', 6),\n",
       " ('down', 6),\n",
       " ('lodging', 6),\n",
       " ('separate', 6),\n",
       " ('plan', 6),\n",
       " ('hub', 6),\n",
       " ('retreat', 6),\n",
       " ('museums', 6),\n",
       " ('thriving', 6),\n",
       " ('size', 6),\n",
       " ('kitchenettes', 6),\n",
       " ('dine', 6),\n",
       " ('hand', 6),\n",
       " ('team', 6),\n",
       " ('corner', 6),\n",
       " ('do', 6),\n",
       " ('ready', 6),\n",
       " ('centers', 6),\n",
       " ('own', 6),\n",
       " ('experiences', 6),\n",
       " ('complete', 6),\n",
       " ('brings', 6),\n",
       " ('furnished', 6),\n",
       " ('ideally', 6),\n",
       " ('kind', 6),\n",
       " ('creative', 6),\n",
       " ('discover', 6),\n",
       " ('television', 6),\n",
       " ('bus', 6),\n",
       " ('homewood', 6),\n",
       " ('nightlife', 6),\n",
       " ('regency', 6),\n",
       " ('level', 6),\n",
       " ('go', 6),\n",
       " ('flexible', 6),\n",
       " ('grill', 6),\n",
       " ('key', 6),\n",
       " ('mt', 6),\n",
       " ('rainier', 6),\n",
       " ('buffet', 6),\n",
       " ('zoo', 6),\n",
       " ('matter', 6),\n",
       " ('headquarters', 6),\n",
       " ('play', 6),\n",
       " ('shower', 6),\n",
       " ('westfield', 6),\n",
       " ('quick', 6),\n",
       " ('smoking', 6),\n",
       " ('channels', 6),\n",
       " ('rate', 6),\n",
       " ('tea', 6),\n",
       " ('enjoyable', 6),\n",
       " ('research', 6),\n",
       " ('busy', 6),\n",
       " ('fred', 6),\n",
       " ('hutchinson', 6),\n",
       " ('very', 6),\n",
       " ('apartments', 6),\n",
       " ('resort', 6),\n",
       " ('gates', 5),\n",
       " ('locals', 5),\n",
       " ('american', 5),\n",
       " ('core', 5),\n",
       " ('diverse', 5),\n",
       " ('america', 5),\n",
       " ('expanded', 5),\n",
       " ('enjoying', 5),\n",
       " ('beverage', 5),\n",
       " ('several', 5),\n",
       " ('crowne', 5),\n",
       " ('exceptional', 5),\n",
       " ('cool', 5),\n",
       " ('drink', 5),\n",
       " ('better', 5),\n",
       " ('might', 5),\n",
       " ('half', 5),\n",
       " ('must', 5),\n",
       " ('houses', 5),\n",
       " ('boutiques', 5),\n",
       " ('pier', 5),\n",
       " ('stadiums', 5),\n",
       " ('mountain', 5),\n",
       " ('newly', 5),\n",
       " ('sleek', 5),\n",
       " ('welcoming', 5),\n",
       " ('elegant', 5),\n",
       " ('residence', 5),\n",
       " ('puts', 5),\n",
       " ('tour', 5),\n",
       " ('via', 5),\n",
       " ('destination', 5),\n",
       " ('culinary', 5),\n",
       " ('cocktails', 5),\n",
       " ('between', 5),\n",
       " ('monorail', 5),\n",
       " ('warwick', 5),\n",
       " ('way', 5),\n",
       " ('possible', 5),\n",
       " ('head', 5),\n",
       " ('craft', 5),\n",
       " ('put', 5),\n",
       " ('end', 5),\n",
       " ('kimpton', 5),\n",
       " ('enough', 5),\n",
       " ('who', 5),\n",
       " ('dedicated', 5),\n",
       " ('beer', 5),\n",
       " ('design', 5),\n",
       " ('could', 5),\n",
       " ('marriott', 5),\n",
       " ('westlake', 5),\n",
       " ('refrigerators', 5),\n",
       " ('desks', 5),\n",
       " ('found', 5),\n",
       " ('ship', 5),\n",
       " ('terminals', 5),\n",
       " ('innovative', 5),\n",
       " ('makes', 5),\n",
       " ('chic', 5),\n",
       " ('beauty', 5),\n",
       " ('system', 5),\n",
       " ('professional', 5),\n",
       " ('facebook', 5),\n",
       " ('sleeping', 5),\n",
       " ('deck', 5),\n",
       " ('made', 5),\n",
       " ('hall', 5),\n",
       " ('30', 5),\n",
       " ('connected', 5),\n",
       " ('customer', 5),\n",
       " ('panel', 5),\n",
       " ('televisions', 5),\n",
       " ('wines', 5),\n",
       " ('course', 5),\n",
       " ('too', 5),\n",
       " ('list', 5),\n",
       " ('ensure', 5),\n",
       " ('holiday', 5),\n",
       " ('either', 5),\n",
       " ('districts', 5),\n",
       " ('fremont', 5),\n",
       " ('choose', 5),\n",
       " ('soft', 5),\n",
       " ('lifestyle', 5),\n",
       " ('adventure', 5),\n",
       " ('adventures', 5),\n",
       " ('simply', 5),\n",
       " ('these', 5),\n",
       " ('beverages', 5),\n",
       " ('forget', 5),\n",
       " ('store', 5),\n",
       " ('intimate', 5),\n",
       " ('receptions', 5),\n",
       " ('details', 5),\n",
       " ('century', 5),\n",
       " ('cities', 5),\n",
       " ('oasis', 5),\n",
       " ('communal', 5),\n",
       " ('express', 5),\n",
       " ('residential', 5),\n",
       " ('runs', 5),\n",
       " ('woodland', 5),\n",
       " ('personal', 5),\n",
       " ('yet', 5),\n",
       " ('bellevue', 5),\n",
       " ('interior', 5),\n",
       " ('healthy', 5),\n",
       " ('above', 5),\n",
       " ('stress', 5),\n",
       " ('term', 5),\n",
       " ('accessible', 5),\n",
       " ('run', 5),\n",
       " ('lodge', 5),\n",
       " ('filled', 5),\n",
       " ('boardroom', 5),\n",
       " ('green', 5),\n",
       " ('self', 5),\n",
       " ('units', 5),\n",
       " ('bustling', 5),\n",
       " ('college', 5),\n",
       " ('northgate', 5),\n",
       " ('round', 5),\n",
       " ('hit', 5),\n",
       " ('won', 5),\n",
       " ('ten', 5),\n",
       " ('pride', 5),\n",
       " ('landing', 5),\n",
       " ('linens', 5),\n",
       " ('ourselves', 5),\n",
       " ('bill', 4),\n",
       " ('melinda', 4),\n",
       " ('foundation', 4),\n",
       " ('visitors', 4),\n",
       " ('majestic', 4),\n",
       " ('sq', 4),\n",
       " ('ft', 4),\n",
       " ('technology', 4),\n",
       " ('wedding', 4),\n",
       " ('success', 4),\n",
       " ('refresh', 4),\n",
       " ('latest', 4),\n",
       " ('tastefully', 4),\n",
       " ('decorated', 4),\n",
       " ('productive', 4),\n",
       " ('lunch', 4),\n",
       " ('gateway', 4),\n",
       " ('plaza', 4),\n",
       " ('touches', 4),\n",
       " ('set', 4),\n",
       " ('lights', 4),\n",
       " ('addition', 4),\n",
       " ('rest', 4),\n",
       " ('5th', 4),\n",
       " ('sounders', 4),\n",
       " ('curated', 4),\n",
       " ('paramount', 4),\n",
       " ('luxurious', 4),\n",
       " ('comfy', 4),\n",
       " ('good', 4),\n",
       " ('why', 4),\n",
       " ('sightseeing', 4),\n",
       " ('leading', 4),\n",
       " ('refreshing', 4),\n",
       " ('covered', 4),\n",
       " ('surrounded', 4),\n",
       " ('trendy', 4),\n",
       " ('side', 4),\n",
       " ('let', 4),\n",
       " ('show', 4),\n",
       " ('memorable', 4),\n",
       " ('fare', 4),\n",
       " ('got', 4),\n",
       " ('diamond', 4),\n",
       " ('scenic', 4),\n",
       " ('never', 4),\n",
       " ('tell', 4),\n",
       " ('without', 4),\n",
       " ('retail', 4),\n",
       " ('adjacent', 4),\n",
       " ('harbor', 4),\n",
       " ('port', 4),\n",
       " ('sculpture', 4),\n",
       " ('look', 4),\n",
       " ('breathtaking', 4),\n",
       " ('sites', 4),\n",
       " ('goal', 4),\n",
       " ('additional', 4),\n",
       " ('elegance', 4),\n",
       " ('named', 4),\n",
       " ('sophisticated', 4),\n",
       " ('afternoon', 4),\n",
       " ('metro', 4),\n",
       " ('500', 4),\n",
       " ('nestled', 4),\n",
       " ('health', 4),\n",
       " ('55', 4),\n",
       " ('mobile', 4),\n",
       " ('settle', 4),\n",
       " ('those', 4),\n",
       " ('getting', 4),\n",
       " ('things', 4),\n",
       " ('science', 4),\n",
       " ('42', 4),\n",
       " ('keep', 4),\n",
       " ('watch', 4),\n",
       " ('coast', 4),\n",
       " ('bright', 4),\n",
       " ('plug', 4),\n",
       " ('selection', 4),\n",
       " ('stand', 4),\n",
       " ('unparalleled', 4),\n",
       " ('expansive', 4),\n",
       " ('energy', 4),\n",
       " ('ours', 4),\n",
       " ('4th', 4),\n",
       " ('would', 4),\n",
       " ('board', 4),\n",
       " ('active', 4),\n",
       " ('kick', 4),\n",
       " ('fuel', 4),\n",
       " ('western', 4),\n",
       " ('romantic', 4),\n",
       " ('clean', 4),\n",
       " ('same', 4),\n",
       " ('upgrade', 4),\n",
       " ('monday', 4),\n",
       " ('something', 4),\n",
       " ('casual', 4),\n",
       " ('marble', 4),\n",
       " ('12', 4),\n",
       " ('movie', 4),\n",
       " ('arena', 4),\n",
       " ('served', 4),\n",
       " ('accommodate', 4),\n",
       " ('week', 4),\n",
       " ('owned', 4),\n",
       " ('mediterranean', 4),\n",
       " ('delivers', 4),\n",
       " ('hip', 4),\n",
       " ('leave', 4),\n",
       " ('products', 4),\n",
       " ('twin', 4),\n",
       " ('early', 4),\n",
       " ('train', 4),\n",
       " ('exercise', 4),\n",
       " ('freshly', 4),\n",
       " ('menu', 4),\n",
       " ('function', 4),\n",
       " ('deluxe', 4),\n",
       " ('redmond', 4),\n",
       " ('snack', 4),\n",
       " ('seafood', 4),\n",
       " ('dip', 4),\n",
       " ('overnight', 4),\n",
       " ('greet', 4),\n",
       " ('relaxed', 4),\n",
       " ('exploring', 4),\n",
       " ('shop', 4),\n",
       " ('affordable', 4),\n",
       " ('give', 4),\n",
       " ('rewards', 4),\n",
       " ('program', 4),\n",
       " ('making', 4),\n",
       " ('pillow', 4),\n",
       " ('authentic', 4),\n",
       " ('inspiring', 4),\n",
       " ('how', 4),\n",
       " ('smoke', 4),\n",
       " ('405', 4),\n",
       " ('hampton', 4),\n",
       " ('kids', 4),\n",
       " ('everyone', 4),\n",
       " ('meals', 4),\n",
       " ('99', 4),\n",
       " ('11', 4),\n",
       " ('accommodation', 4),\n",
       " ('training', 4),\n",
       " ('beach', 4),\n",
       " ('points', 4),\n",
       " ('tours', 4),\n",
       " ('econo', 4),\n",
       " ('microwaves', 4),\n",
       " ('outlets', 4),\n",
       " ('little', 4),\n",
       " ('conditioned', 4),\n",
       " ('americas', 4),\n",
       " ('money', 4),\n",
       " ('historical', 4),\n",
       " ('year', 4),\n",
       " ('attention', 4),\n",
       " ('jimmy', 4),\n",
       " ('national', 4),\n",
       " ('theodore', 4),\n",
       " ('warmth', 4),\n",
       " ('maxwell', 4),\n",
       " ('creature', 4),\n",
       " ('definition', 4),\n",
       " ('bacon', 4),\n",
       " ('georgetown', 4),\n",
       " ('grove', 4),\n",
       " ('watertown', 4),\n",
       " ('gaslight', 4),\n",
       " ('oak', 4),\n",
       " ('southport', 4),\n",
       " ('allows', 3),\n",
       " ('kayaking', 3),\n",
       " ('versatile', 3),\n",
       " ('guarantee', 3),\n",
       " ('equipment', 3),\n",
       " ('chihuly', 3),\n",
       " ('earn', 3),\n",
       " ('chef', 3),\n",
       " ('blend', 3),\n",
       " ('soon', 3),\n",
       " ('concierge', 3),\n",
       " ('fantastic', 3),\n",
       " ('amid', 3),\n",
       " ('incredible', 3),\n",
       " ('haven', 3),\n",
       " ('prime', 3),\n",
       " ('crafted', 3),\n",
       " ('celebrated', 3),\n",
       " ('unforgettable', 3),\n",
       " ('reflect', 3),\n",
       " ('hosts', 3),\n",
       " ('company', 3),\n",
       " ('underground', 3),\n",
       " ('everywhere', 3),\n",
       " ('landscape', 3),\n",
       " ('region', 3),\n",
       " ('bounty', 3),\n",
       " ('truly', 3),\n",
       " ('pleasant', 3),\n",
       " ('breeze', 3),\n",
       " ('deep', 3),\n",
       " ('waters', 3),\n",
       " ('straight', 3),\n",
       " ('breweries', 3),\n",
       " ('mount', 3),\n",
       " ('soak', 3),\n",
       " ('before', 3),\n",
       " ('heading', 3),\n",
       " ('destinations', 3),\n",
       " ('ever', 3),\n",
       " ('reading', 3),\n",
       " ('couldn', 3),\n",
       " ('alexis', 3),\n",
       " ('matched', 3),\n",
       " ('attentive', 3),\n",
       " ('interesting', 3),\n",
       " ('giving', 3),\n",
       " ('edge', 3),\n",
       " ('indulge', 3),\n",
       " ('loft', 3),\n",
       " ('ceilings', 3),\n",
       " ('them', 3),\n",
       " ('elements', 3),\n",
       " ('still', 3),\n",
       " ('sit', 3),\n",
       " ('outfitted', 3),\n",
       " ('balconies', 3),\n",
       " ('perks', 3),\n",
       " ('gym', 3),\n",
       " ('sunset', 3),\n",
       " ('rock', 3),\n",
       " ('fireplaces', 3),\n",
       " ('kitchenette', 3),\n",
       " ('bistro', 3),\n",
       " ('fairmont', 3),\n",
       " ('acclaimed', 3),\n",
       " ('news', 3),\n",
       " ('report', 3),\n",
       " ('2018', 3),\n",
       " ('shows', 3),\n",
       " ('outdoors', 3),\n",
       " ('fans', 3),\n",
       " ('athletic', 3),\n",
       " ('begin', 3),\n",
       " ...]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "words_freq = sorted(words_freq, key=lambda x:x[1],reverse=True)  # 排序重复的次数\n",
    "words_freq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里重复最多的the我们并不是重要的信息词，后面我们需要进行怎样的优化呢"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_top_n_words(corpus, n=None):\n",
    "    # 获取某数据中最长出现的n个词\n",
    "    vec = CountVectorizer().fit(corpus)  # 寄存器\n",
    "    bag_of_words = vec.transform(corpus)  # 将文本转数值\n",
    "    sum_words = bag_of_words.sum(axis=0)  # 计算每个词重复的次数\n",
    "    words_freq = [(word, sum_words[0,idx]) for word,idx in vec.vocabulary_.items()]  # 得到词及对应出现的次数\n",
    "    words_freq = sorted(words_freq, key=lambda x:x[1],reverse=True)  # 排序重复的次数\n",
    "    return words_freq[:n]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('the', 1258),\n",
       " ('and', 1062),\n",
       " ('of', 536),\n",
       " ('seattle', 533),\n",
       " ('to', 471),\n",
       " ('in', 449),\n",
       " ('our', 359),\n",
       " ('you', 304),\n",
       " ('hotel', 295),\n",
       " ('with', 280),\n",
       " ('is', 271),\n",
       " ('at', 231),\n",
       " ('from', 224),\n",
       " ('for', 216),\n",
       " ('your', 186),\n",
       " ('or', 161),\n",
       " ('center', 151),\n",
       " ('are', 136),\n",
       " ('downtown', 133),\n",
       " ('on', 129)]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "common_words = get_top_n_words(df['desc'], 20)\n",
    "common_words"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>desc</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>the</td>\n",
       "      <td>1258</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>and</td>\n",
       "      <td>1062</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>of</td>\n",
       "      <td>536</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>seattle</td>\n",
       "      <td>533</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>to</td>\n",
       "      <td>471</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      desc  count\n",
       "0      the   1258\n",
       "1      and   1062\n",
       "2       of    536\n",
       "3  seattle    533\n",
       "4       to    471"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_common_words = pd.DataFrame(common_words, columns=['desc', 'count'])\n",
    "df_common_words.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 1.0, 'top 20')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaUAAAEWCAYAAADGjIh1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3deZgdVbX38e/PEEJCJoaoYWzDG8QQJTHNlUjCDYIIqAhXFDAKATWPom9Arkziq9yLURQuIk4YEEEERJlEFCIXSYLMHcgIRKYwyWAYAiEYIKz3j9pNTprTU9Knq86p3+d5+uk6VfvUWbuLnMWu2rVKEYGZmVkRvC3vAMzMzFo5KZmZWWE4KZmZWWE4KZmZWWE4KZmZWWE4KZmZWWE4KZmZWWE4KZkVjKSlkvas0b4/Kulvkl6Q9JSkcyQNqtjeT9J5kl5M24+pRRxm7XFSMiuXIcB3gC2A9wBbAadVbD8ZGAlsC+wOHCdp716O0UrMScmsQCRdCGwD/FHSCknHpfX7SVqcRjizJL2n4j1LJZ0o6R5Jz0v6laSNqu0/Ii6OiOsiYmVEPA+cA+xa0eRQ4JSIeD4i7k3bp9Sou2Zv4aRkViAR8TngUeDjETEwIn4gaXvgEuBoYBjwZ7KktWHFWycDHwG2A7YHvtnFj9wNWAwgaROyEdT8iu3zgR3XvUdm3eOkZFZ8BwF/iojrI+I14HSgP/DBijY/iYjHIuI5YDpwSGc7lfRh4DDgW2nVwPR7eUWz5cAgzHqJk5JZ8W0BPNL6IiLeAB4Dtqxo81jF8iPpPe2StAtwMXBgRPw9rV6Rfg+uaDoYeGndwjbrPicls+JpW7r/H2QTDwCQJGBr4ImKNltXLG+T3lOVpLHA1cAREXHDmx+aXWN6EtipovlOpNN7Zr3BScmseJ4GRlS8/h3wUUl7SOoL/CewCrilos1XJG0laVPgG8Cl1XYsaTRwHfB/I+KPVZr8GvimpE0k7QB8ETh/fTtk1lVOSmbF8z2yxPCCpK9HxBLgs8CPgWXAx8kmQrxa8Z6Lgb8AD6Wf77Sz7/8kmyzxyzS7b4WkypHQt4EHyU4BzgZOi4jrerBvZh2SH/JnVt8kLQW+EBH/m3csZuvLIyUzMysMJyUzMysMn74zM7PC8EjJzMwKY4O8A6hnm2++eTQ1NeUdhplZXZk7d+6yiBhWbZuT0npoamqipaUl7zDMzOqKpEfa2+bTd2ZmVhhOSmZmVhhOSmZmVhhOSmZmVhhOSmZmVhhOSmZmVhhOSmZmVhhOSmZmVhi+eXY9LHxiOU0n/CnXGJae+tFcP9/MrCd5pGRmZoXhpGRmZoVR2qQk6RhJi9LP0ZKaJN0r6RxJiyX9RVL/vOM0MyuTUiYlSeOAw4EPALsAXwQ2AUYCP42IHYEXgE9Wee9USS2SWlavXN6LUZuZNb5SJiVgAnBlRLwcESuAK4CJwMMRMS+1mQs0tX1jRMyIiOaIaO4zYEivBWxmVgZlTUpqZ/2qiuXVeHaimVmvKmtSmgPsL2mApI2BA4Cbco7JzKz0SjkSiIi7JJ0P3JFWnQs83939vHfLIbT4PiEzsx5TyqQEEBFnAGe0WT26YvvpvRuRmZmVNin1hCJUdABXdTCzxtFr15QknSzp6zXYb5Okz/T0fs3MrPc1wkSHJsBJycysAdQ0KUk6SdISSf8LvDutGyPpNkkLJF0paRNJb5c0N23fSVJI2ia9fjDNkjtf0lmSbpH0kKQD08ecCkyUNE/S1yRtJOlXkhZKulvS7mk/f5b0vrR8t6RvpeVTJH1B0iRJsyRdJuk+SRdJam/quJmZ1UDNklKqmnAwMBb4D2DntOnXwPER8T5gIfDtiHgG2EjSYLKbWFvIEs22wDMRsTK9dzjZja8fI0tGACcAN0XEmIj4IfAVgIh4L3AIcIGkjcimgU9Mn/E6sGt6/wTWTAcfCxwNjAJGVLSp7JcrOpiZ1UgtR0oTyaomrIyIF4GrgY2BoRExO7W5ANgtLd9ClgR2A76bfk9k7fuHroqINyLiHuAd7XzuBOBCgIi4D3gE2D7tZ7e0/U/AQEkDgKaIWJLee0dEPB4RbwDzcEUHM7NeVevZd9GNtjeRJaFtgT8Ax6f3X1PRprLiQnun1tpbfyfQDDwEXA9sTlbzbm47+3dFBzOzXlbLkdIc4ABJ/SUNAj4OvAw8L2liavM5YHZF+88C96eRynPAvsDNnXzOS8CgNp87GUDS9sA2wJKIeBV4DPg0cBtZEvw6ruRgZlYYNRsJpKoJl5KdBnuENV/+hwFnp1NnD5FV6yYilqZ5BXNSu78BW0VEZ5UWFgCvS5oPnA/8LO1/Idm1oykR0ToCugnYIyJWSroJ2Ir1SEqu6GBm1rMU0Z0zbFapubk5Wlpa8g7DzKyuSJobEc3VtvmayXooSkUHcFUHM2sMjXDzrJmZNQgnpURSn7xjMDMru9IkJUlXSZorabGkqWndCkn/Lel2YLykcZJmp3YzJQ3POWwzs1Ip0zWlIyLiOUn9gTslXU52M++iiPiWpL5k09M/ERH/lHQQMB04onInKaFNBegzeFjv9sDMrMGVKSlNk3RAWt4aGEl2g+zlad27yZ6ndH2amt4HeLLtTiJiBjADoN/wkZ66aGbWg0qRlCRNAvYExqd7lGYBGwH/iojVrc2AxRExPp8ozcysLNeUhgDPp4S0A7BLlTZLgGGSxgNI6itpx94M0sys7EoxUgKuA74kaQFZ8rmtbYOIeDU9DuMsSUPI/jZnAovb26krOpiZ9axSJKVUZmifKpsGtmk3jzVVy83MrJeVIinVSpEqOrTlCg9mVo/Kck0JSUMlHZl3HGZm1r7SJCVgKNCtpKRMmf5GZma5qpsvXEmHSlogab6kCyUNk3S5pDvTz66p3cmSzpM0S9JDkqalXZwKbCdpnqTTUttj03sXSPqvtK5J0r2SfgbcRXZPk5mZ9YK6uKaUpmafBOwaEcskbQr8BPhhRPxN0jbATOA96S07ALuTPfxviaSfAycAoyNiTNrnXmQ30P4b2T1KV0vaDXiU7EbawyPiLSMrV3QwM6udukhKwIeAyyJiGUAqF7QnMCpVXwAYnJ5wC/CnNONulaRngHdU2ede6efu9HogWZJ6FHgkIt4ybTx9tis6mJnVSL0kJQFtE8DbyCo0vLJWwyxJrapYtZrq/RTwvYj4RZv3N5E9tt3MzHpZvVxTugH4tKTNANLpu78AX21tIGlMJ/t4iex0XquZwBGSBqb3bynp7T0atZmZdUtdjJQiYrGk6cBsSavJTrlNA36aqjRsAMwBvtTBPp6VdLOkRcC1EXGspPcAt6bR1Qrgs2Qjqy5xRQczs56lCF8WWVfNzc3R0tKSdxhmZnVF0tyIaK62rS5GSkVV5IoOrVzZwczqSb1cUzIzsxJwUuqAKzqYmfWu0n/hSjpG0qL0c7QrOpiZ5afU15QkjQMOBz5Adt/S7cBsXNHBzCwXZR8pTQCujIiXI2IFcAUwkU4qOkREc0Q09xkwpDdjNTNreGVPSmpnvSs6mJnloOxJaQ6wv6QBkjYGDgBuyjkmM7PSKvU1pYi4S9L5wB1p1bnA8119vys6mJn1rFInJYCIOAM4o83q0XnEYmZWdqVPSuujHio6dMTVHsysaMp+TcnMzArESakDkvrkHYOZWZk0TFKSdIqkoypeT5d0lKTTUrWGhZIOStsmSbqmou1PJE1Jy0slfUvS34BP9XY/zMzKrGGSEvBL4DCAVK/uYOBxYAywE7AncJqk4V3Y178iYkJE/LbtBklTJbVIalm9cnnPRW9mZo2TlCJiKfCspLHAXmQPApwAXBIRqyPiabISQjt3YXeXdvA5ruhgZlYjjTb77lxgCvBO4Dyy5FTN66ydkDdqs90VHczMctAwI6XkSmBvstHQTLKKDQdJ6iNpGLAb2Y2yjwCjJPWTNATYI6+AzcxsjYYaKUXEq5JuBF6IiNWSrgTGA/OBAI6LiKcAJP0OWADcT3aqr9tc0cHMrGcpIvKOocekCQ53AZ+KiPtr/XnNzc3R0tJS648xM2sokuZGRHO1bQ0zUpI0CriG7FEUNU9IUP8VHTriag9mloeGSUoRcQ8worN2kqYBXwbuiojJNQ/MzMy6rGGSUjccCewTEQ931lDSBhHxei/EZGZmlCwpSTqbbDR1dXpkxcT0eiUwNSIWSDoZ2AJoApYBn8klWDOzEmq0KeEdiogvAf8AdidLOndHxPuAbwC/rmg6DvhERLwlIbmig5lZ7ZQqKbUxAbgQICL+CmyW7lkCuDoiXqn2Jld0MDOrnTInJVVZ1zo/3hUdzMxyUOakNAeYDFnVcGBZRLyYa0RmZiVXqokObZwM/ErSArKJDod1dweu6GBm1rNKl5Qioqni5SeqbD+514IxM7O1lC4p9aRGrujQypUdzKw3Ncw1JUnTJN0r6aK8YzEzs3XTSCOlt1RqcEUGM7P60hBJqU2lhm3InhzbBCyTdATwc6CZ7OF+x0TEjZKmAPsDfYDRwP8AGwKfA1YB+0bEc73cFTOzUmuI03dtKjX8kLUrMnwltXkvcAhwgaTWJ82OJisj9G/AdGBlRIwFbgUOrfZZruhgZlY7DZGUqqisyFBZueE+sqfObp+23RgRL0XEP4HlwB/T+oVkI623cEUHM7PaadSkVFmRoVrlhlarKpbfqHj9Bg1yatPMrJ40alKqVFm5YXtgG2BJrhGZmVlVZRgN/Aw4W9JCsokOUyJildTRAKprXNHBzKxnKSI6b2VVNTc3R0tLS95hmJnVFUlzI6K52rYyjJRqpgwVHVq5soOZ9YYyXFPqNknfyDsGM7MyclKqzknJzCwHpT99J+kqYGtgI+BHZJUh+kuaByyOiMl5xmdmVialT0rAERHxnKT+wJ3AvwNfjYgx1RpLmgpMBegzeFjvRWlmVgI+fQfTJM0HbiMbMY3sqLErOpiZ1U6pR0rpMeh7AuMjYqWkWWSn8czMLAddGilJ2ljS29Ly9pL2k9S3tqH1iiHA8ykh7QDskta/1iD9MzOrK126eVbSXGAisAnZaa4WsoradT0JQFI/4CpgS7LSQ8OAk4F9gP2Auzrqo2+eNTPrvp64eVZpNPF54McR8QNJd/dciPmIiFVkCaitWcDxvRuNmZl1OSlJGk9W2PTz3XxvwypTRYeucuUHM1sfXZ19dzRwInBlRCyWNAK4sXZh9S5Jt+Qdg5mZdXG0ExGzgdkVrx8CptUqqN4WER/MOwYzM+skKUn6I9DuTIiI2K/HI8qBpBURMVDScOBSYDDZ3+bLEXFTvtGZmZVHZyOl09Pv/wDeCfwmvT4EWFqjmPL0GWBmREyX1AcY0LaBKzqYmdVOh0kpnbZD0ikRsVvFpj9KmlPTyPJxJ3BeukfpqoiY17ZBRMwAZgD0Gz7SD6MyM+tBXZ3oMCxNbgBA0rvI7ulpKBExB9gNeAK4UNKhOYdkZlYqXZ3W/TVglqSH0usm0imsRiJpW+CJiDhH0sbA+4Ff5xyWmVlpdJqUUnmhF8kKle6QVt+XbjxtNJOAYyW9BqwAOhwpvXfLIbT4vhwzsx7TaVKKiDck/U9EjAfm90JMvS4iBqbfFwAX5ByOmVlpdfX03V8kfRK4IrpSLK8kXNGhfa7sYGbroqsTHY4Bfg+8KulFSS9JerGGcdWMpD9LGpp+jqxYP0nSNXnGZmZWdl1KShExKCLeFhF9I2Jwej241sHVQkTsGxEvAEOBIztrb2ZmvafLT55Nz1A6Pf18rJZBrQ9Jx0malpZ/KOmvaXkPSb+RtFTS5sCpwHaS5kk6Lb19oKTLJN0n6SJJyqkbZmal1NWH/J0KHAXck36OSuuKaA7Zs58AmskSTV9gAlBZMugE4MGIGBMRx6Z1Y8mKz44CRgC7tt25pKmSWiS1rF65vFZ9MDMrpa6OlPYFPhwR50XEecDeaV0RzQXGSRoErAJuJUtOE1k7KVVzR0Q8HhFvAPPI7sdaS0TMiIjmiGjuM2BIz0ZuZlZyXT59R3YNplVhv40j4jWyunyHA7eQJaLdge2Aezt5e+W9V6vxM6PMzHpVV790vwvcJWkWILJSPCfWKqgeMAf4OnAEsBA4A5gbEVFxmeglYFA+4ZmZWTVdTUofBc4DngceBY6PiKdqFtX6uwk4Cbg1Il6W9C/anLqLiGcl3SxpEXAt0O0bjlzRwcysZ6kr98JK+hDZRIGJZBMA5gFzIuJHtQ2v2Jqbm6OlpSXvMMzM6oqkuRHRXHVbVws0pOcL7Ux2feZLwCsRsUPH72ps/YaPjOGHnZl3GHXBFR7MrFVHSamrU8JvAG4GDgKWADsXJSFJakqn4LrafoqkLbrQ7nxJB65fdGZm1h1dnX23AHgVGA28DxgtqX/NoqqtKUCnScnMzHpfV8sMfS09efYA4FngV8ALtQysm/pIOkfSYkl/kdRf0hhJt0laIOlKSZukkU8zcFGq5NBf0jhJsyXNlTRT0vC8O2NmVlZdPX33VUmXkk1w2J9sJt4+tQysm0YCP42IHcmS5SfJHs53fES8j2xa+Lcj4jKgBZgcEWOA14EfAwdGxDiyfk3v6INc0cHMrHa6OiW8P2vu9Xm9hvGsq4cjYl5ankt2o+zQiJid1l1AVuW8rXeTnZK8Pt2/1Ad4sqMPiogZwAzIJjqsf+hmZtaqS0kpIk7rvFWu2lZiGNpewzYELE4PMDQzs5x1p8xQPVkOPC+ptTDr54DWUVNlJYclwDBJ4wEk9ZW0Y69GamZmb2rk2m6HAWdLGgA8RFYLD+D8tP4VYDxwIHCWpCFkf48zgcVd+QBXdDAz61ldvnnW3soVHczMuq+jm2cbeaRUcwufWE7TCd0umVd6ru5gZu1p1GtKZmZWh5yUzMysMBo2KUk6RdJRFa+nSzpK0mmSFklaKOmgtG2SpGsq2v5E0pQcwjYzK7WGTUrAL8lm4CHpbcDBwOPAGGAnYE/gtO6WFXJFBzOz2mnYpBQRS4FnJY0F9gLuJnsm1CURsToinia7d2nnbu53RkQ0R0RznwGFfSq8mVldavTZd+eSVQV/J1ldu73aafc6ayfojWoblpmZVdOwI6XkSmBvstHQTGAOcJCkPpKGAbsBdwCPAKMk9Us30e6RV8BmZmXW0COliHhV0o3ACxGxWtKVZFUc5gMBHBcRTwFI+h3Zc6PuJzvV1ylXdDAz61kNXdEhTXC4C/hURNzf0/t3RQczs+4rZUUHSaOAa4Ara5GQwBUdepOrQJiVQ8MmpYi4BxiRdxxmZtZ1jT7RYb1IatikbWZWRKVISpKOSVUcFkk6WlKTpEUV278u6eS0PEvSdyXNBo5qb59mZtbzGn4kIGkc2bOUPkD2pNnbWfPAv/YMjYh/b2d/U4GpAH0GD+vBSM3MrAwjpQlkkx1ejogVwBXAxE7ec2l7G1zRwcysdsqQlFRl3VA6ruDwcu3CMTOz9pQhKc0B9pc0QNLGwAHAtcDbJW0mqR/wsVwjNDMzoATXlCLiLknnk5UTAjg3Iu6U9N9k15ceBu5bl327ooOZWc9q6IoOteaKDmZm3VfKig69wRUdGoOrRZgVRxmuKXVI0i15x2BmZpnSJ6WI+GDeMZiZWab0SUnSivR7UqrmcJmk+yRdJKnadHIzM6uR0ielNsYCRwOjyIq57tq2gaSpkloktaxeuby34zMza2hOSmu7IyIej4g3gHlAU9sGruhgZlY7TkprW1WxvBrPTjQz61VOSmZmVhgeCawHV3QwM+tZpU9KETEw/Z4FzKpY/9WcQjIzK63SJ6X14YoOjc2VHsx6X6mvKUkaKunIvOMwM7NMqZMS2XOVnJTMzAqi7KfvTgW2kzQPuD6t2wcI4DsR0e4TaM3MrOeVfaR0AvBgRIwBbgPGADsBewKnSRre9g2u6GBmVjtlT0qVJgCXRMTqiHgamA3s3LaRKzqYmdWOk9IaLr5qZpazsiell4BBaXkOcJCkPpKGAbux5hHqZmbWC0o90SEinpV0s6RFwLXAAmA+2USH4yLiqY7e74oOZmY9q9RJCSAiPtNm1bG5BGJmZk5K68MVHawjrghh1n0Nf01J0v6SRlW8niJpi4rXsyQ15xOdmZlVavikBOxP9iTZVlOALao3NTOzPBU6KUnaWNKfJM2XtEjSQZLGSZotaa6kma03uEr6oqQ7U9vLJQ2Q9EFgP7IbYedJOh5oBi5Kr/u3+by9JN0q6S5Jv5c0sPd7bWZWXoVOSsDewD8iYqeIGA1cB/wYODAixgHnAdNT2ysiYueI2Am4F/h8RNwCXA0cGxFjIuL7QAswOb1+pfWDJG0OfBPYMyLen9od0zYgV3QwM6udok90WAicLun7wDXA88Bo4HpJAH2AJ1Pb0ZK+Q1ZkdSAws5uftQvZab6b0743BG5t2ygiZgAzAPoNHxnd/AwzM+tAoZNSRPxd0jhgX+B7ZEVTF0fE+CrNzwf2j4j5kqYAk7r5cQKuj4hD1j1iMzNbH4U+fZdmya2MiN8ApwMfAIZJGp+295W0Y2o+CHhSUl9gcsVuKqs2VHvd6jZgV0n/J+17gKTte7RDZmbWoUKPlID3kk1SeAN4Dfgy8DpwlqQhZPGfCSwG/h9wO/AI2Wm/1sTzW+AcSdOAA8lGVGdLegV4c8QVEf9MI6xLJPVLq78J/L3d4FzRwcysRynCl0XWVXNzc7S0tOQdhplZXZE0NyKq3h9a9JFSobmig3XEFR3Muq/Q15TyImmapHslXZR3LGZmZeKRUnVHAvtExMN5B2JmVialHylJOiZVi1gk6WhJZwMjgKslfS3v+MzMyqTUI6V0D9ThZFPNRTZ777NklSR2j4hlVd4zFZgK0GfwsN4L1sysBMo+UpoAXBkRL0fECuAKYGJHb4iIGRHRHBHNfQYM6ZUgzczKouxJSXkHYGZma5Q9Kc0B9k/VGzYGDgBuyjkmM7PSKvU1pYi4S9L5wB1p1bkRcXcqyNopV3QwM+tZpU5KABFxBnBGm3VN+URjZlZupU9K68MVHcysu1zpo2Nlv6bULkkr8o7BzKxsnJTMzKwwGjopSbpK0lxJi9NNr0haIWm6pPmSbpP0jrT+XZJulXSnpFPyjdzMrJwaOikBR0TEOKAZmCZpM2Bj4LaI2IlsSvgXU9sfAT+PiJ2Bp9rboaSpkloktaxeubzG4ZuZlUujJ6VpkuaTPVV2a2Ak8CpwTdo+F2hKy7sCl6TlC9vboSs6mJnVTsPOvpM0CdgTGB8RKyXNAjYCXos1TzZczdp/Az/x0MwsR408UhoCPJ8S0g7ALp20vxk4OC1PrmlkZmZWVcOOlIDrgC9JWgAsITuF15GjgIslHQVc3pUPcEUHM7OepTVnsqy7mpubo6WlJe8wzMzqiqS5EdFcbVsjj5RqzhUdzKyMalmVopGvKQEgaaikI9PyJEnXdPYeMzPLR8MnJWAocGTeQZiZWefKcPruVGA7SfOA14CXJV0GjCa7T+mzERHp0ehnAAOBZcCUiHgyr6DNzMqoDCOlE4AHI2IMcCwwFjgaGAWMAHaV1Bf4MXBgqgBxHjC92s5c0cHMrHbKMFJq646IeBwgjZ6agBfIRk7Xpwf89QGqjpIiYgYwA6Df8JGeumhm1oPKmJRWVSy3VnQQsDgixucTkpmZQTlO370EDOqkzRJgmKTxAJL6Stqx5pGZmdlaGn6kFBHPSrpZ0iLgFeDpKm1elXQgcJakIWR/lzOBxR3t2xUdzMx6VsMnJYCI+Ew7679asTwP2K3XgjIzs7cow+k7MzOrE05KZmZWGE5KZmZWGE5KZmZWGE5KZmZWGE5KZmZWGE5KZmZWGE5KZmZWGH4c+nqQ9BJZiaJGsDnZIzsagftSPI3SD3BfesK2ETGs2oZSVHSooSXtPWe+3khqcV+Kp1H60ij9APel1nz6zszMCsNJyczMCsNJaf3MyDuAHuS+FFOj9KVR+gHuS015ooOZmRWGR0pmZlYYTkpmZlYYTkrrSNLekpZIekDSCXnH0xFJW0u6UdK9khZLOiqt31TS9ZLuT783Sesl6azUtwWS3p9vD95KUh9Jd0u6Jr1+l6TbU18ulbRhWt8vvX4gbW/KM+62JA2VdJmk+9LxGV+Px0XS19J/W4skXSJpo3o5JpLOk/RMejp167puHwNJh6X290s6rEB9OS3997VA0pWShlZsOzH1ZYmkj1Ssz+/7LSL8080foA/wIDAC2BCYD4zKO64O4h0OvD8tDwL+DowCfgCckNafAHw/Le8LXAsI2AW4Pe8+VOnTMcDFwDXp9e+Ag9Py2cCX0/KRwNlp+WDg0rxjb9OPC4AvpOUNgaH1dlyALYGHgf4Vx2JKvRwTsidOvx9YVLGuW8cA2BR4KP3eJC1vUpC+7AVskJa/X9GXUem7qx/wrvSd1ifv77fc/4Ouxx9gPDCz4vWJwIl5x9WN+P8AfJisGsXwtG442c3AAL8ADqlo/2a7IvwAWwE3AB8CrklfEMsq/uG9eXyAmcD4tLxBaqe8+5DiGZy+zNVmfV0dl5SUHktfyBukY/KRejomQFObL/JuHQPgEOAXFevXapdnX9psOwC4KC2v9b3Velzy/n7z6bt10/qPsNXjaV3hpVMlY4HbgXdExJMA6ffbU7Oi9+9M4DjgjfR6M+CFiHg9va6M982+pO3LU/siGAH8E/hVOhV5rqSNqbPjEhFPAKcDjwJPkv2N51Kfx6RVd49BIY9NFUeQjfSgoH1xUlo3qrKu8HPrJQ0ELgeOjogXO2paZV0h+ifpY8AzETG3cnWVptGFbXnbgOxUy88jYizwMtmpovYUsi/pessnyE4BbQFsDOxTpWk9HJPOtBd74fsk6STgdeCi1lVVmuXeFyeldfM4sHXF662Af+QUS5dI6kuWkC6KiCvS6qclDU/bhwPPpPVF7t+uwH6SlgK/JTuFdyYwVFJrLcfKeN/sS9o+BHiuNwPuwOPA4xFxe3p9GVmSqrfjsifwcET8MyJeA64APkh9HpNW3T0GRT02QDYJA/gYMDnSOTkK2hcnpXVzJzAyzS7akOxi7dU5x9QuSQJ+CdwbEWdUbLoaaJ0ldBjZtabW9YemmUa7AMtbT2XkLSJOjIitIqKJ7O/+14iYDNwIHJiate1Lax8PTO0L8X+wEQgCwhYAAAJ1SURBVPEU8Jikd6dVewD3UH/H5VFgF0kD0n9rrf2ou2NSobvHYCawl6RN0shxr7Qud5L2Bo4H9ouIlRWbrgYOTrMh3wWMBO4g7++3PC7ENcIP2Sycv5PNUjkp73g6iXUC2fB7ATAv/exLdh7/BuD+9HvT1F7AT1PfFgLNefehnX5NYs3suxFk/6AeAH4P9EvrN0qvH0jbR+Qdd5s+jAFa0rG5imzmVt0dF+C/gPuARcCFZDO66uKYAJeQXQt7jWyU8Pl1OQZk12seSD+HF6gvD5BdI2r9t392RfuTUl+WAPtUrM/t+81lhszMrDB8+s7MzArDScnMzArDScnMzArDScnMzArDScnMzArDScms5CQdLWlA3nGYgZ88a1Z6qTpGc0QsyzsWM4+UzOqApEPT83DmS7pQ0raSbkjrbpC0TWp3vqQDK963Iv2eJGmW1jy76aJUlWAaWb26GyXdmE/vzNbYoPMmZpYnSTuS3Xm/a0Qsk7Qp2XOYfh0RF0g6AjgL2L+TXY0FdiSrY3Zz2t9Zko4BdvdIyYrAIyWz4vsQcFlr0oiI58ieeXNx2n4hWSmpztwREY9HxBtk5WaaahCr2XpxUjIrPtH5owNat79O+nediqNuWNFmVcXyanymxArIScms+G4APi1pM4B0+u4WsurNAJOBv6XlpcC4tPwJoG8X9v8SMKingjVbH/4/JbOCi4jFkqYDsyWtBu4GpgHnSTqW7Om1h6fm5wB/kHQHWTJ7uQsfMQO4VtKTEbF7z/fArOs8JdzMzArDp+/MzKwwnJTMzKwwnJTMzKwwnJTMzKwwnJTMzKwwnJTMzKwwnJTMzKww/j/5MpZU5GEvRAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.barh(df_common_words['desc'], df_common_words['count'])\n",
    "plt.xlabel('count')\n",
    "plt.ylabel('words')\n",
    "plt.title('top 20')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到top20大多数是无关紧要的词"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_top_n_words(corpus, n=None):\n",
    "    # 获取某数据中最长出现的n个词，并增加停用词过滤\n",
    "    vec = CountVectorizer(stop_words='english').fit(corpus)  # 增加停用词，即自动过滤掉某些字或词\n",
    "    bag_of_words = vec.transform(corpus)\n",
    "    sum_words = bag_of_words.sum(axis=0)\n",
    "    words_freq = [(word, sum_words[0,idx]) for word,idx in vec.vocabulary_.items()]  # 得到词及对应出现的次数\n",
    "    words_freq = sorted(words_freq, key=lambda x:x[1],reverse=True)\n",
    "    return words_freq[:n]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>desc</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>seattle</td>\n",
       "      <td>533</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>hotel</td>\n",
       "      <td>295</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>center</td>\n",
       "      <td>151</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>downtown</td>\n",
       "      <td>133</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>free</td>\n",
       "      <td>123</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       desc  count\n",
       "0   seattle    533\n",
       "1     hotel    295\n",
       "2    center    151\n",
       "3  downtown    133\n",
       "4      free    123"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "common_words = get_top_n_words(df['desc'], 20)\n",
    "df_common_words = pd.DataFrame(common_words, columns=['desc', 'count'])\n",
    "df_common_words.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 1.0, 'top 20')"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaUAAAEWCAYAAADGjIh1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3debyd073H8c9XECIkNGlfMZ7SGGIKOVRMRVGldaPVqiqJKVerDVVjq8pFUW4nrSF6SQzFNV0xlGgqQQg5mRNDFVFTi4oQUUPyu388a7OznWGfnLPn7/v12q/97PWs59lrHcf5ZT3Pen5LEYGZmVk1WKHSDTAzM8txUDIzs6rhoGRmZlXDQcnMzKqGg5KZmVUNByUzM6saDkpmZlY1HJTMqoyk+ZL2LNG595P0kKQ3Jf1D0hWSVs/b31PSlZLeSvtPKEU7zNrioGTWWPoA5wBrA5sB6wIX5u0/ExgIbADsDpwsaZ8yt9EamIOSWRWRdA2wPnCHpEWSTk7l+0ual0Y4EyVtlnfMfEmnSXpc0gJJV0lapbXzR8QfI+KeiFgcEQuAK4Cd8qocBpwdEQsi4om0f0SJumv2CQ5KZlUkIg4F/g58NSJ6R8QvJG0MXA8cD/QH7iYLWivnHXoI8CVgI2Bj4PQiv3JXYB6ApDXJRlCz8vbPAjZf/h6ZdY6Dkln1Owi4KyLui4gPgIuAVYEd8+r8LiJeiIg3gHOBgzs6qaS9gOHAGamod3pfmFdtIbA6ZmXioGRW/dYGns99iIilwAvAOnl1Xsjbfj4d0yZJOwB/BA6MiL+m4kXpfY28qmsAby9fs806z0HJrPoUpu5/mWziAQCSBKwHvJRXZ7287fXTMa2StA0wDjgiIiZ89KXZPaZXgK3zqm9NurxnVg4OSmbV55/Ahnmf/xfYT9IXJa0E/Ah4D3g4r86xktaVtBbwY+DG1k4saQvgHuAHEXFHK1WuBk6XtKakTYGjgTFd7ZBZsRyUzKrPeWSB4U1JJ0bEU8B3gIuB14Gvkk2EeD/vmD8C44Fn0+ucNs79I7LJEv+TZvctkpQ/EvoZ8AzZJcBJwIURcU839s2sXfIif2a1TdJ84KiI+HOl22LWVR4pmZlZ1XBQMjOzquHLd2ZmVjU8UjIzs6qxYqUbUMv69esXTU1NlW6GmVlNmTZt2usR0b+1fQ5KXdDU1ERLS0ulm2FmVlMkPd/WPl++MzOzquGgZGZmVcNByczMqoaDkpmZVQ0HJTMzqxoOSmZmVjUclMzMrGo4KJmZWdXww7NdMOelhTSdelelm9Gu+efvV+kmmJkVrWwjJUlNkuaW4LxnSjqxlfJNJc2UNEPSRp085whJa3dfK83MrBhVdflOUo9uPN0w4PaI2CYinunksSMAByUzszIrd1BaUdJYSbMl3Sypl6T5ks6Q9BDwDUkbSbpH0jRJD0raFEDSVyU9mkY+f5b0mcKTSzpa0p8k7QccDxwl6f607//SOedJGpnKekgaI2mupDmSfijpQKAZuC6NtFYt20/HzKzBlfue0ibAkRExWdKVwPdS+b8jYmcASROAYyLiaUmfBy4B9gAeAnaIiJB0FHAy8KPciSV9H9gbGBYR70m6DFgUERelKkdExBspyEyVdAvQBKwTEVukc/SNiDfTuU6MiE9kW00BbSRAjzVaTXJrZmbLqdxB6YWImJy2rwVGpe0bAST1BnYEbpKUO6Znel8XuFHSAGBl4Lm88x4KvEgWkD5o47tHSTogba8HDASeAjaUdDFwFzC+ow5ExGhgNEDPAQO9QqKZWTcq9+W7wj/iuc/vpPcVgDcjYnDea7O072LgdxGxJfCfwCp555lLNupZt7UvlbQbsCcwNCK2BmYAq0TEAmBrYCJwLPCH5e+amZl1VbmD0vqShqbtg8kuyX0kIt4CnpP0DQBltk67+wAvpe3hBeedQRaoxrUxa64PsCAiFqd7VDuk8/cDVoiIW4CfAtum+m8Dqy9nH83MbDmV+/LdE8BwSZcDTwOXAj8oqHMIcKmk04GVgBuAWcCZZJf1XgKmAJ/NPygiHkpTw++StFfBOe8BjpE0m+yS3ZRUvg5wlaRccD4tvY8BLpP0Ltno6t3WOrPlOn1o8XNAZmbdRhG+LbK8mpubwyvPmpl1jqRpEdHc2j5ndOiCWsjo0BZnejCzalRVD8+amVljq6uglCZG1FWfzMwaSc3/AU859Z6QdAkwHTg0ZWeYK+mCvHoHt1G+SNIFKdvDnyVtL2mipGcl7V+JPpmZNaqaD0rJJsDVwH7A2WQZIAYD20kalqaJX1BYno5dDZgYEUPIpoKfA+wFHAD8V+EXSRopqUVSy5LFC0vcLTOzxlIvQen5iJgCbEcWYF6LiA+B64Bd2ykHeJ9syjjAHGBSygoxh+yB3GVExOiIaI6I5h69+pS0U2ZmjaZeglIuI4Ta2N9WOcAH8fG8+KXAewARsRTPTjQzK6t6CUo5jwJfkNQvLYNxMDCpnXIzM6sidTUSiIhXJJ0G3E82Oro7Im4HaKu8K5zRwcysezmjQxc4o4OZWec5o0OJ1HJGB3BWBzOrPvV2T6lDko6RdFjaHtFGVnEzM6uAhhspRcRleR9HkK3F9HJlWmNmZvnqPiilUdGJZAsKzgaeARYB84Fm4Lq0RMVPgKMi4oB03F7AdyPia5Vot5lZI6rry3eSNicLNnukFWePy+2LiJuBFuCQiBgM3A1sJql/qnI4cFUr53RGBzOzEqnroESWVujmiHgdICLeaKtieoD2GuA7kvoCQ4E/tVLPGR3MzEqk3i/fieyyXbGuAu4A/g3clFISmZlZmdT7SGkC8E1JnwKQtFbB/reB1XMfIuJlskkPp5MtiW5mZmVU1yOliJgn6VxgkqQlwAyyCQ45Y4DL0kSHoRHxLlmy1v4R8XhH53dGBzOz7lXXQQkgIsYCY9vYdwtwS0HxzsAVpW6XmZl9Ut0Hpc6QNI0s4/iPiqlf6xkdcpzZwcyqRb3fU0LSw8XWjYghEbEr8GVJg0rYLDMza0XdB6WI2HE5DhsGOCiZmZVZ3QclSYsk7Sbpzryy30kakbbPl/S4pNmSLpK0I7A/cKGkmZI2qlDTzcwaTkPfU0pTxA8ANo2IkNQ3It6UNA64M2V9KDxmJDASoMca/Qt3m5lZF9T9SKkDb5E9KPsHSV8DFnd0gDM6mJmVTqMEpQ9Ztq+rAKSMDduTTQsfBtxT/qaZmVlOo1y+ex4YJKknWUD6IvCQpN5Ar4i4W9IU4G+p/jKZHszMrDwaIShFRLwg6X/Jlq54miyzA2SB53ZJq5DlyfthKr8BuELSKODAiHimtRM7o4OZWfeq66CUct69ARARJwMnt1Jt+8KCiJiMp4SbmZVd3QaltMz5ROCiUn1HvWR0yHFmBzOrtIpNdJDUJGluF8+xtqRPTNuGLON3RGwcERd35TvMzKx8anqklJaaOLDS7TAzs+5R6SnhK0oam7Ip3Cypl6T5kvoBSGqWNDFtfyFlWJgpaYak1fNHW5JGSLpV0j2Snpb0i9yXSNpb0iOSpku6Kc26+0Q2h1T2DUlzJc2S9EDZfyJmZg2s0iOlTYAjI2KypCuB77VT90Tg2FS3N9lDr4UGA9sA7wFPSboYeJds0b49I+IdSacAJ0j6HQXZHNI5zgC+FBEv5ZV9xBkdzMxKp9IjpRfSTDeAa8nWMmrLZOCXaZp23zaWKp8QEQsj4t/A48AGwA5kM+kmS5oJDE/lbWVzmAyMkXQ00KPwC5zRwcysdCodlKKVz/nZF1b5aEfE+cBRwKrAFEmbtnK+9/K2l5CNBAXcFxGD02tQRBzZVjaHiDiGbGS1HjAzt5S6mZmVXqWD0vqShqbtg4GHyJYrH5LKvp6rKGmjiJgTERcALUBrQak1U4CdJH0unaeXpI3TJcA+EXE3cDzZpb/c9zwaEWcAr5MFJzMzK4NK31N6Ahgu6XKyTAuXAo8B/yPpx8CjeXWPl7Q72QjoceBPwICOviAiXkvLVFyf0gxBNhJ6m9azOVwoaWAqmwDMauvczuhgZta9FFF4Bc2K1dzcHC0tLZVuhplZTZE0LSKaW9tX6ZFSTau3jA45zuxgZpVS6XtKZSPp4Uq3wczM2tcwQSkidqx0G8zMrH0NE5QkLUrvu0mamDJIPCnpOklK++ZLOitlfpjTxrRzMzMrkYYJSgW2IZsGPgjYENgpb9/rEbEt2UzAEwsPlDRSUoukliWLF5alsWZmjaJRg9JjEfFiRCwFZgJNeftuTe/TCsoBZ3QwMyulRg1KrWV+KNxXWG5mZiXWqEHJzMyqkEcCXeCMDmZm3athglJE9E7vE8mWSc+Vfz9vuylvuwXYrVztMzOzBgpKpeCMDmZm3ath7ilJ+oOkQWl7UaXbY2Zmn9QwI6WIOKrSbTAzs/bV3UhJUlPK1DBW0uyUuaFXyuLQXFC3n6RHJO2XPp8kaWo67qzK9MDMrHHVXVBKNgFGR8RWZMuef6+wgqTPAHcBZ0TEXZL2BgaSrUY7GBgiaddWjnNGBzOzEqnXoPRCRExO29cCOxfsX4lsAb+TI+K+VLZ3es0AppOtbDuw8MTO6GBmVjr1ek+pcOXCws8fkqUR+hIwKZUJOC8iLi9x28zMrA31OlJaX9LQtH0w8FDB/gCOADaVdGoquxc4QlJvAEnrSPp0WVprZmZA/Y6UngCGS7oceJos4/dX8ytExBJJ3wLukPRWRFwiaTPgkbSSxSLgO8CrbX2JMzqYmXWveg1KSyPimIKy3XIbedkd3ie7hJcr/w3wm3I00MzMPqleg1JZ1GtGhxxndjCzcqu7e0oRMT8itigsl3SMpMMq0SYzMytOw4yUIuKySrfBzMzaV9MjJUnfkfSYpJmSLpfUQ9IiSedKmiVpSnpIFklnSjoxbQ9O+2ZLuk3SmpI2kjQ979wDJU2rVN/MzBpRzQalNFPuIGCniBhMtlLsIcBqwJSI2Bp4ADi6lcOvBk5JGR/mAD+LiGeAhZIGpzqHA2Na+V5ndDAzK5GaDUrAF4EhwFRJM9PnDYH3gTtTnWlAU/5BkvoAfSMi99DsWCCXTugPwOGSepAFvD8WfqkzOpiZlU4tByUBYyNicHptEhFnAh9ERC6DwxI6d9/sFuDLwFeAaRHxr25tsZmZtauWg9IE4MBc1gVJa0naoKODImIhsEDSLqnoUFKqoYj4N1lmh0uBq0rSajMza1PNzr6LiMclnQ6Ml7QC8AFwbEeHpffhwGWSegHPkt0/yrkO+BowvqM2OKODmVn3qtmgBBARNwI3FhT3ztt/M3Bz+vgp4PlUPhPYoY3T7gxcGRFLure1ZmbWkZoOSsWSdDbweeDMDurdBmwE7FHMees9o0OOMzuYWbnU8j2lzpgAvNrRxIWIOCBNE19X0r7laZqZmeXUfVCStDyjwcGAg5KZWZlVbVCS1CTpSUl/kDRX0nWS9pQ0WdLTkrZPr4clzUjvm6RjR0i6SdIdFExYkLRdqr+hpNUkXSlpair7D0krA/8FHJQyRRxUge6bmTWkar+n9DngG8BIYCrwbbKJCPsDPwYOA3aNiA8l7Qn8HPh6OnYosFVEvCFpNwBJOwIXA/8REX+X9HPgLxFxhKS+wGPAn4EzgOaI+H5hgySNTO2hxxr9S9NrM7MGVe1B6bmImAMgaR4wISJC0hyyTA19gLGSBpJN914p79j7IuKNvM+bAaOBvSPi5VS2N7B/LicesAqwfnsNiojR6Tz0HDCwcJl1MzPrgmoPSu/lbS/N+7yUrO1nA/dHxAGSmoCJefXfKTjXK2RBZxsgF5QEfD0insqvKOnz3dB2MzPrpKLuKaV7Lyuk7Y0l7S9ppY6OK4M+wEtpe0QHdd8E9gN+nrucR5a94QdK659L2iaVvw2s3q0tNTOzDhU7UnoA2EXSmmTTq1vIEpYeUqqGFekXZJfvTgD+0lHliPinpK8Cf5J0BNlI69fA7BSY5pPlvbsfODUlej0vPaT7Cc7oYGbWvfRx7tJ2KknTI2JbST8AVo2IX0iaERHbdHhwHWtubo6WlpZKN8PMrKZImhYRza3tK3akJElDyUZGR3by2LrljA5mZt2r2OeUjgdOA26LiHmSNiS7xGVmZtZtihrtpAXxJuV9fhYYVapGmZlZY2o3KKWMCG3edIqI/bu9RR2QtBrwv8C6QA+yyQoXkGUL3z1V+3ZE/C1NajgdWBn4F3BImuzQm+wh2may/p0VEbdI2hs4C+gJPAMcHhGLytc7M7PG1tHlu4uA/waeA94FrkivRcDc0jatTfsAL0fE1hGxBXBPKn8rIrYHfkc2ow7gIWCHNCHjBuDkVP5TYGFEbJkSsP5FUj+yALZnRGxLNsPwhMIvlzRSUoukliWLF5aqj2ZmDandkVK6bIeksyNi17xdd0h6oKQta9sc4CJJFwB3RsSD6TGj69P+64Ffpe11gRslDSAbLT2XyvcEvpU7YUQskPQVYBAwOZ1vZeCRwi93Rgczs9IpdgZdf0kbpntJSPosUJHEbxHxV0lDyLJ4nycpl3A1P0Dkti8GfhkR49IDs2emcvHJy5IiS010cEkabmZmHSp29t0PgYmSJkqaSDbz7riStaodktYGFkfEtWSXF7dNuw7Ke8+NcPIzPgzPO8144KNkq+mh4CnATpI+l8p6Sdq4JJ0wM7NWdThSSumF3gIGApum4icj4r22jyqpLYELJS0FPgC+S7bkeU9Jj5IF2txo50zgJkkvkQWdz6byc4DfS5oLLCGb6HCrpBHA9ZJ6pnqnA39tsyHO6GBm1q2KzejwSEQMLUN7louk+WRLTbxezu91Rgczs87rjowO4yV9Hbg1ioliDaJRMjrkOLODmZVasfeUTgBuAt6X9JaktyW9VcJ2dUjS3WlhPiKiqTtHSZKGSRrUXeczM7PiFJvRoeqWcYiIfQvLUqZvRcTS5T2vpBWBYcCdwOPL30IzM+usYkdKpDWULkqvr5SyUa189/9JmiZpXlqOHEnzJfWT1CTpCUmXANOB9SQtkvTfkqZLmiCpfzpmsKQpkmZLui3NuiPNKvy5pEnAKWTLrV8oaaakjcrZVzOzRlbsIn/nk00Bfzy9jktl5XJERAwhSws0StKnCvZvAlwdEdtExPPAasD0lJlhEvCzVO9q4JSUxWFOXjlA34j4QkScC4wDToqIwRHxTP4XOaODmVnpFDvRYV9gcO6ymKSxwAzg1FI1rMAoSQek7fXIpqfnez4ipuR9XkqWCw/gWuBWSX3IAk8usexYsvtkOa0u5FfIGR3MzEqnM2si9QXeSNt9StCWVqVMDHsCQyNicXp4d5WCau90cJpigkdH5zAzsxIr9p7Sz4HpksakUdK0VFYOfYAFKSBtCuxQxDErAAem7W8DD0XEQmCBpF1S+aHkLcdR4G2g6iZ3mJnVu2JHSvsBVwILgL+T3Zf5R8latax7gGMkzQaeIsvM0JF3gM0lTQMW8nEKouHAZZJ6Ac8Ch7dx/A3AFZJGAQcW3lfKcUYHM7PuVWxGhz2AnYFdgA2BmcADEfGb0jZv+UhaFBG9S/09zuhgZtZ57WV0KCoopZP0ALYjW0jvGODdiNi0/aMqo1xBqeeAgTFg+K87rlhHnNXBzLqqy2mGJE0gm2b9CPAgsF1EvNp9Tew+aSLEbhVuhpmZLYdiJzrMBt4HtgC2AraQtGrJWmVmZg2pqKAUET9MK88eAPwLuAp4s5QN60jK5PCkpLEpQ8PNaQJDfp1L04Ou8ySdlVe+naSHJc2S9Jik1SX1kHShpKnpfP9Z/l6ZmTW2Yi/ffZ9sksMQ4HmymXgPlrBdxdoEODIiJku6Evhewf6fRMQb6X7YBElbAU+SPSh7UERMlbQG8C5wJLAwIrZL6ylNljQ+Ip7LP2FKczQSoMcaFVl818ysbhU7JXxV4JfAtIj4sITt6awXImJy2r4WGFWw/5spiKwIDAAGkT1I+0pETAWIiLcAJO0NbCUp93xTH7LMEcsEJWd0MDMrnWKzhF9Y6oYsp8Kg8NFnSZ8FTiSblLFA0hiyTBBq5ThS+Q8i4t4StdXMzDpQdJbwKrW+pNyKuAcDD+XtW4PsIdqFkj4DfDmVPwmsLWk7gHQ/aUXgXuC7klZK5RtLWq0cnTAzs0xnct9VoyeA4ZIuB54GLgW+ChARsyTNAOaRZW+YnMrfl3QQcHGaQfguWW69PwBNZOmUBLxGtq5Sm5zRwcysexX98Gy1kdQE3BkRW1SqDc7oYGbWeV1+eNZaN+elhTSdelelm1ERzuxgZqVQs/eUImJ+saMkSccXPsNkZmbVp2aDUicdDzgomZlVuboLSpJWk3RXytYwV9LPgLWB+yXdn+p8ItODpC9Kui3vPHtJurUyvTAza0z1eE9pH+DliNgPIC2Dfjiwe0S8nuq0lunhL8DvJfWPiNfSMVcVntwZHczMSqfuRkrAHGBPSRdI2iWtOFvom5KmAzOAzYFBkU1DvAb4jqS+wFDgT4UHRsToiGiOiOYevcq2KryZWUOou5FSRPxV0hBgX+A8SePz97eT6QGykdEdwL+Bm6ospZKZWd2ru5GSpLWBxRFxLXARsC3wNrB6qtJWpgci4mXgZeB0YEwZm21mZtThSAnYErhQ0lLgA+C7pEtxkl6JiN1by/SQ5zqgf0Q83uEXOaODmVm3qruglBKqFiZVbQEuzqszop1T7Axc0f0tMzOzjtRdUOoKSdPILu39qJj6jZzRAZzVwcy6n4NSnogYUuk2mJk1spqc6KBMTbbdzMzaVjN/2CU1SXpC0iXAdOBQSXNS1oYL8uod3Eb5ovTs0jRJf5a0vaSJkp6VtH+qs7mkxyTNlDRb0sDy99TMrHHVTFBKNgGuBvYDzgb2AAYD20kalqaDX1BYno5dDZiYLtG9DZwD7AUcAPxXqnMM8JuIGAw0Ay8WNkDSyJSiqGXJ4taeyzUzs+VVa0Hp+YiYAmxHFmBeSw+4Xgfs2k45wPvAPWl7DjApIj5I202p/BHgx5JOATaIiHcLG+CMDmZmpVNrQemd9K429rdVDvBBfLyi4VLgPYCIWEqa8BERfwT2J1uN9l5Je3S5xWZmVrRaC0o5jwJfkNQvJVU9GJjUTnlRJG0IPBsRvwXGAVt1f9PNzKwtNTklPCJekXQacD/Z6OjuiLgdoK3yIh1ElpD1A+AffHyvqVXO6GBm1r308RUt66zm5uZoaWmpdDPMzGqKpGkR0dzavpocKVWLRs/oAM7qYGbdqyrvKUla1M3nGyZpUKXbYWZm7avKoFQCw4BOByUzMyuvqg5KKZ3QhSk7wxxJB+XtOzmVzZJ0fio7WtLUVHaLpF6SdiSb5n1hytSwUXrdk7I7PChp03T8ZyU9ks5xdmV6bWbWuKr9ntLXyDIzbA30A6ZKeiCVDQM+HxGLJa2V6t8aEVcASDoHODIiLpY0DrgzIm5O+yYAx0TE05I+D1xClgXiN8ClEXG1pGNba5CkkcBIgB5r9C9Nr83MGlS1B6WdgesjYgnwT0mTyLI2fAG4KiIWA0TEG6n+FikY9QV688l1lZDUG9gRuEn66Fnbnul9J+DrafsaspRFy4iI0cBogJ4DBnrqoplZN6r2oNRe5obWAsIYYFhEzJI0AtitlTorAG+m/HatcaAxM6uQqr6nBDwAHCSph6T+ZHnsHgPGA0dI6gWQd/ludeAVSSsBh+Sd5+20j4h4C3hO0jfSsZK0dao3GfhW2s4/3szMyqDaR0q3AUOBWWQjmJMj4h/APZIGAy2S3gfuBn4M/JQs1dDzZIlWV0/nuQG4QtIo4ECygHOppNOBldL+WcBxwB8lHQfc0lHjnNHBzKx7OaNDFzijg5lZ5zmjQ4k4o0NxnPXBzIpV7feUlpukUWml2usq3RYzMytOPY+Uvgd8OSKeyxVIWjEt/mdmZlWoLkdKki4DNgTGSVooabSk8cDVaSbfhSlrw2xJ/5l33El55WdVrANmZg2qLkdKEXGMpH2A3YHvA18Fdo6Id1NGhoURsZ2knsDkFLAGptf2ZM9BjZO0a0Q8kH9uZ3QwMyudugxKrRgXEe+m7b2BrSQdmD73IQtGe6fXjFTeO5UvE5Sc0cHMrHQaJSi9k7ct4AcRsUwKIklfAs6LiMvL2jIzM/tIXd5T6sC9wHdT1gckbSxptVR+RMqNh6R1JH26gu00M2s4jTJSyvcHoAmYriwj62tk+fLGS9oMeCQlal0EfAd4ta0TOaODmVn3ckaHLnBGBzOzznNGhxJxRoeucaYHMytUtntKks6UdGIJztsk6dvdfV4zMyu/epjo0AQ4KJmZ1YGSBiVJP5H0lKQ/A5ukssGSpqSsCbdJWlPSpyVNS/u3lhSS1k+fn5HUS9IYSb+V9LCkZ/OeMzof2EXSTEk/lLSKpKskzZE0Q9Lu6Tx3S9oqbc+QdEbaPlvSUZJ2kzRR0s2SnpR0nfKWpjUzs9IrWVCSNIRswbxtgK+RLWMOcDVwSkRsRbbm0c8i4lVgFUlrALsALWSBZgPg1dyy58AAsiXSv0IWjABOBR6MiMER8SvgWICI2BI4GBgraRWyh2B3Sd/xIdnS56TzPZi2twGOBwaRpSnK1cnv10hJLZJalixe2KWfkZmZLauUI6VdgNsiYnFa7XUcsBrQNyImpTpjyVaTBXiYLAjsCvw8ve/CxwED4P8iYmlEPA58po3v3Rm4BiAiniRb8G/jdJ5d0/67gN5p5dqmiHgqHftYRLwYEUuBmWSXBpcREaMjojkimnv06tOpH4iZmbWv1LPvOjPf/EGyILQBcDtwSjr+zrw67+Vtt3Vpra3yqUAz8CxwH9APOBqY1sb5l+DZiWZmZVXKkdIDwAGSVpW0OllS1HeABZJ2SXUOBSbl1f8O8HQaqbwB7AtM7uB73ubjZc9z5zkEsmwNwPrAUxHxPvAC8E1gClkQPJFlR2JmZlZBJRsJRMR0STeSXQZ7no//+A8HLkuXzp4FDk/156d5BbkEqA8B60bEgg6+ajbwoaRZwBjgknT+OWT3jkZERG4E9CDwxYhYLOlBYF26EJSc0cHMrHs5o0MXOKODmVnnOaNDiTijQ/k4+4NZY6iHh2eLIqmvpO9Vuh1mZta2hglKQF+gU0FJmUb6GZmZVVTN/MGVdFjKAjFL0jWS+hWbP2YAAAnPSURBVEu6RdLU9Nop1TtT0pUpO8OzkkalU5wPbJQyP1yY6p6Ujp0t6axU1iTpCUmXANOB9SrRXzOzRlQT95QkbQ78BNgpIl6XtBbwO+BXEfFQSkl0L7BZOmRTYHeyqeJPSbqULPPDFhExOJ1zb7Llzrcne7ZpnKRdgb+TpUQ6PCI+MbKSNBIYCdBjjf6l6rKZWUOqiaAE7AHcHBGvA0TEG5L2BAblpadbIz0PBXBXmgb+nqRXaT37w97pNSN97k0WpP4OPB8RU1prSESMBkYD9Bww0FMXzcy6Ua0EJfHJ7BArAEMj4t1lKmZBqpjMDALOi4jLC45vInvI18zMyqxW7ilNAL4p6VMA6fLdeOD7uQqSBndwjsLMD/cCR0jqnY5fR9Knu7XVZmbWKTUxUoqIeZLOBSZJWkJ2yW0U8HtJs8n68QBwTDvn+JekyZLmAn+KiJMkbQY8kkZXi8jSHC0ptl3O6GBm1r2c0aELnNHBzKzznNGhRJzRwWqdM2VYtamVe0ptSs8Vze1E/RGS1i6i3pi81W3NzKwMaj4oLYcRQIdByczMyq9eglIPSVdImidpfFrDabCkKSlbw22S1kwjn2bgupTZYVVJQyRNkjRN0r2SBlS6M2ZmjapegtJA4PcRsTnwJvB14GrglIjYCpgD/CwibgZagENSZocPgYuBAyNiCHAlcG57XyRppKQWSS1LFi8sXY/MzBpQvUx0eC4iZqbtacBGQN+IyK1qOxa4qZXjNgG2AO5L08J7AK+090XO6GBmVjr1EpQKMzj0LfI4AfMiYmj3N8nMzDqrXi7fFVoILJC0S/p8KJAbNeVndngK6C9pKICklVLyVzMzq4B6GSm1ZjhwmaRewLPA4al8TCp/FxgKHAj8VlIfsp/Hr4F5xXyBMzqYmXUvZ3ToAmd0MDPrPGd0KBFndDCzRlTKTCD1ek/pI5KGSRqU93mZjA5phdpWI7aZmZVX3QclYBgwKO/zCJzRwcysKlV1UJK0mqS7JM2SNFfSQW1lYJB0tKSpqe4tknpJ2hHYH7gwZXA4hYKMDgXft7ekRyRNl3RTbq0lMzMrj6oOSsA+wMsRsXVEbAHcQ9sZGG6NiO0iYmvgCeDIiHgYGAecFBGDI+IC8jI65K9aK6kfcDqwZ0Rsm+qdUNggZ3QwMyudap/oMAe4SNIFwJ3AAtrOwLCFpHPIHpztTbaybGfsQHaZb3I698rAI4WVnNHBzKx0qjooRcRfJQ0B9gXOA+6j7QwMY4BhETFL0ghgt05+nYD7IuLg5W+xmZl1RVVfvkuz5BZHxLXARcDnaTsDw+rAK5JWAg7JO01+BofWPudMAXaS9Ll07l6SNu7WDpmZWbuqeqQEbEk2SWEp8AHwXbLM3q1lYPgp8CjwPNllv1zguQG4QtIosuwNY1g2owMAEfFaGmFdL6lnKj4d+GubjXNGBzOzbuWMDl3gjA5mZp3XXkaHqr58Z2ZmjcVByczMqoaDkpmZVQ0HJTMzqxoOSmZmVjUclMzMrGo4KJmZWdVwUDIzs6rhh2e7QNLbwFOVbkcZ9QNer3Qjysj9rW/ub+VsEBH9W9tR7WmGqt1TbT2VXI8ktbi/9cv9rW+10l9fvjMzs6rhoGRmZlXDQalrRle6AWXm/tY397e+1UR/PdHBzMyqhkdKZmZWNRyUzMysajgoLSdJ+0h6StLfJJ1a6fZ0B0lXSnpV0ty8srUk3Sfp6fS+ZiqXpN+m/s+WtG3lWt55ktaTdL+kJyTNk3RcKq/X/q4i6TFJs1J/z0rln5X0aOrvjZJWTuU90+e/pf1NlWz/8pLUQ9IMSXemz3XbX0nzJc2RNFNSSyqrud9nB6XlIKkH8Hvgy8Ag4GBJgyrbqm4xBtinoOxUYEJEDAQmpM+Q9X1geo0ELi1TG7vLh8CPImIzYAfg2PTfsF77+x6wR0RsDQwG9pG0A3AB8KvU3wXAkan+kcCCiPgc8KtUrxYdBzyR97ne+7t7RAzOex6p9n6fI8KvTr6AocC9eZ9PA06rdLu6qW9NwNy8z08BA9L2ALIHhgEuBw5urV4tvoDbgb0aob9AL2A68HmyJ/xXTOUf/V4D9wJD0/aKqZ4q3fZO9nNdsj/EewB3Aqrz/s4H+hWU1dzvs0dKy2cd4IW8zy+msnr0mYh4BSC9fzqV183PIF2q2QZ4lDrub7qUNRN4FbgPeAZ4MyI+TFXy+/RRf9P+hcCnytviLvs1cDKwNH3+FPXd3wDGS5omaWQqq7nfZ6cZWj5qpazR5tbXxc9AUm/gFuD4iHhLaq1bWdVWymqqvxGxBBgsqS9wG7BZa9XSe033V9JXgFcjYpqk3XLFrVSti/4mO0XEy5I+Ddwn6cl26lZtfz1SWj4vAuvlfV4XeLlCbSm1f0oaAJDeX03lNf8zkLQSWUC6LiJuTcV129+ciHgTmEh2L62vpNw/TvP79FF/0/4+wBvlbWmX7ATsL2k+cAPZJbxfU7/9JSJeTu+vkv2jY3tq8PfZQWn5TAUGppk8KwPfAsZVuE2lMg4YnraHk917yZUflmbx7AAszF0mqAXKhkT/AzwREb/M21Wv/e2fRkhIWhXYk2wCwP3AgalaYX9zP4cDgb9EuvlQCyLitIhYNyKayP7//EtEHEKd9lfSapJWz20DewNzqcXf50rf1KrVF7Av8Fey6/I/qXR7uqlP1wOvAB+Q/UvqSLLr6hOAp9P7WqmuyGYgPgPMAZor3f5O9nVnsssVs4GZ6bVvHfd3K2BG6u9c4IxUviHwGPA34CagZypfJX3+W9q/YaX70IW+7wbcWc/9Tf2alV7zcn+TavH32WmGzMysavjynZmZVQ0HJTMzqxoOSmZmVjUclMzMrGo4KJmZWdVwUDJrcJKOl9Sr0u0wA688a9bwUtaD5oh4vdJtMfNIyawGSDosrXszS9I1kjaQNCGVTZC0fqo3RtKBecctSu+7SZoo6WZJT0q6Lj3NPwpYG7hf0v2V6Z3Zx5yQ1azKSdoc+AlZws3XJa0FjAWujoixko4AfgsM6+BU2wCbk+U4m5zO91tJJ5Ctw+ORklWcR0pm1W8P4OZc0IiIN8jWAvpj2n8NWdqkjjwWES9GxFKytEpNJWirWZc4KJlVP9HxsgK5/R+S/r9OSWdXzqvzXt72EnylxKqQg5JZ9ZsAfFPSpwDS5buHybJfAxwCPJS25wND0vZ/ACsVcf63gdW7q7FmXeF/KZlVuYiYJ+lcYJKkJWTZvkcBV0o6CXgNODxVvwK4XdJjZMHsnSK+YjTwJ0mvRMTu3d8Ds+J5SriZmVUNX74zM7Oq4aBkZmZVw0HJzMyqhoOSmZlVDQclMzOrGg5KZmZWNRyUzMysavw/6exuClSYTy8AAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.barh(df_common_words['desc'], df_common_words['count'])\n",
    "plt.xlabel('count')\n",
    "plt.ylabel('words')\n",
    "plt.title('top 20')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到这次的top 20清晰了很多，如最多的seattle、hotle、center等，这里还是一个词一个词去分的，词组起来连贯后意思会不同，如在机场的便利店附近的酒店，这个酒店除了在便利店附近，还得是机场附近。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_top_n_words(corpus, n=None):\n",
    "    # 获取某数据中最长出现的n个词，并增加停用词，增加连贯词\n",
    "    vec = CountVectorizer(stop_words='english',ngram_range=(2,2)).fit(corpus)  # 增加两次词连贯的\n",
    "    bag_of_words = vec.transform(corpus)\n",
    "    sum_words = bag_of_words.sum(axis=0)\n",
    "    words_freq = [(word, sum_words[0,idx]) for word,idx in vec.vocabulary_.items()]  # 得到词及对应出现的次数\n",
    "    words_freq = sorted(words_freq, key=lambda x:x[1],reverse=True)\n",
    "    return words_freq[:n]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 1.0, 'top 20')"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeAAAAEWCAYAAAC+H0SRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nOzdebzd073/8ddbzEK0pX5oCWqeQhKtsaGqvapoUVVFSiltKS2tDldVB5SrLWoI1WgpamqVXomqiJkMZDD1lriG1HBLiHl4//5Yny3f7Ox9zj7JOdnnJJ/n45HH2Xt913d913efw9rrO7y/sk1KKaWU5q9F2t2BlFJKaWGUA3BKKaXUBjkAp5RSSm2QA3BKKaXUBjkAp5RSSm2QA3BKKaXUBjkAp5RSSm2QA3BKqW0kTZO0Yw+1/SlJt0p6QdK/JJ0nadnK8iUkXSDpxVj+zZ7oR0rN5ACcUlpQDQB+AqwCrA98ADilsvx4YG1gdWB74NuSPjmf+5gWYjkAp5TaQtLvgdWAv0iaKenbUb6rpKkxcx0jaf3KOtMkfVfS/ZKel/RbSUs2at/2H2xfb/sV288D5wFbV6rsD/zY9vO2H4jlw3tod1OaQw7AKaW2sL0f8L/Ap233t/1zSesAlwBHAisCf6UM0ItXVt0X+ASwFrAO8IMWN7kdMBVA0nsoM+P7KsvvAzac+z1KqWtyAE4p9SZ7A9fZvsH2m8CpwFLAVpU6Z9p+3Pa/gZ8C+3TWqKSPAwcAx0VR//g5o1JtBrAsKc0nOQCnlHqTVYDHam9svwM8DqxaqfN45fVjsU5Tkj4C/AHY0/bDUTwzfi5Xqboc8NLcdTulrssBOKXUTvWPY3uKclEUAJIEfBB4slLng5XXq8U6DUnaDLgGOND2je9utJwTng5sWqm+KXGIOqX5IQfglFI7PQ2sWXn/R+BTkj4maTHgW8DrwO2VOl+T9AFJ7wW+B1zWqGFJGwHXA4fb/kuDKr8DfiDpPZLWAw4GRs7rDqXUqhyAU0rtdCJlEHxB0tG2HwK+CJwBPAd8mnKR1huVdf4AjAYeiX8/adL2tygXcv0mrrKeKak6w/0h8E/KYeybgVNsX9+N+5ZSh2TXHwFKKaXeSdI04Mu2/9buvqQ0r3IGnFJKKbVBDsAppZRSG+Qh6JRSSqkNcgacUkoptcGi7e5A6jtWWGEFDxw4sN3dSCmlPmX8+PHP2V6xvjwH4NSygQMHMm7cuHZ3I6WU+hRJjzUqz0PQKaWUUhvkAJxSSim1QQ7AKaWUUhvkAJxSSim1QQ7AKaWUUhvkAJxSSim1QQ7AKaWUUhvkAJxSSim1QQZxpJZNfnIGA4+9rt3d6HOmnfSpdnchpdQL5Qx4Lkk6UtLSlfd/lbR8O/tUJWl3SRu0ux8ppZQaywF47h0JvDsA297Z9gtt7E+93YEuDcCS8ohISinNJ31mAJa0v6RJku6T9PsoW13SjVF+o6TVonykpNMl3S7pEUl7RvllknautDlS0h6S+kk6RdI90dZXYvkwSWMkXSHpQUkXqzgCWAW4SdJNUXeapBXi9TclTYl/R0bZQEkPSDpP0lRJoyUt1WA/V5J0deznfZK2ivIvSrpb0r2SzpXUL8pnSvpp1L0z1t8K2BU4JeqvFf+ulzRe0i2S1qt8BqfFfpzcI7+8lFJKc+gTA7CkDYHvAzvY3hT4Riw6E/id7U2Ai4HTK6utDGwD7AKcFGWXAntHm4sDHwP+ChwEzLA9FBgKHCxpjVhnM8psdwNgTWBr26cDTwHb296+rq+DgS8BHwY+Em1tFovXBn5te0PgBWCPBrt7OnBz7OfmwFRJ60e/t7Y9CHgb2DfqLwPcGfXHAgfbvh24BjjG9iDb/wRGAIfbHgwcDZxV2eY6wI62v1XfGUmHSBonadzbr8xo0N2UUkpzo68cctwBuML2cwC2/x3lWwKfjde/B35eWedPtt8B7pe0UpT9N3C6pCWATwJjbb8qaSdgk9pMGRhAGSzfAO62/QSApHuBgcCtHfR1G+Bq2y/HOlcB21IGxEdt3xv1xkdbjfZ1/9jPt4EZkvYDBgP3SAJYCngm6r8BXFtp8+P1DUrqD2wFXB7rAyxRqXJ5bGsOtkdQBm+WWHltd7DfKaWUuqCvDMACWvmff7XO63XrY/s1SWOAT1BmlJdUlh9ue9RsG5WG1bXzNp1/ZupgWX1bcxyC7qDNC21/t8GyN23X9rtZ/xYBXojZcyMvt9iPlFJK3aRPHIIGbgQ+J+l9AJLeG+W3A5+P1/vS8cy05lLKIeJtgdqAOwo4TNJi0f46kpbppJ2XgGUblI8Fdpe0dLTxGeCWFvpVcyNwWPSjn6TlomxPSe+P8vdKWr3V/tl+EXhU0l6xviRt2oU+pZRS6mZ9YgZse6qknwI3S3obmAgMB44ALpB0DPAsZWDtzGjgd8A1tt+IsvMph4MnqByjfZZyFXFHRgD/LWl69Tyw7QmSRgJ319q2PVHSwBb6BuX89ghJB1FmtIfZvkPSD4DRkhYB3gS+BjR8yHO4FDgvLhjbk/IF5exoZ7FYfl+LfQJg41UHMC7vaU0ppW6hWUcvU+rYkCFDPG7cuHZ3I6WU+hRJ420PqS/vEzPg1DtkElbPy9SslBYeveYcsKQhkk7vvOY8beP2+DlQ0hd6clst9OXd+4bryneVdOw8tDtbQldKKaXeqdcMwLbH2T5iXttRB2lOtreKlwOBtg7Azdi+xvZJnddsaraErpRSSr1TjwzAMcOcUnl/tKTj4/UYSSdHqtPDkraN8mGSrpW0SMwOl6+s/z+R8LSipCtVEqvukbR1LD9e0ghJo4HfSdqwkho1SdLaUW9mNHkSsG0sPyqSoQZVtnebpE3q9umvtTJJEyUdF69/LOnLkvqrpHFNkDRZ0m6xfBlJ16kkVU2RtHel2cMr9WvJVMMlnRmvmyV6LSLpLJVErWujb3uqcULXPtH+FEknV/ZnjgStufx1p5RSmgvtmgEvansLymzth9UFEZ7xZ8rtO0j6MDDN9tPAr4BfRGLVHpSrl2sGA7vZ/gJwKPCruO91CPBE3faPBW6JlKhfRDvDY3vrAEvYnlS3zljKoL0c8BawdZRvQ7nN6DXgM7Y3B7YH/iuuqP4k8JTtTW1vBFxfafO5qH82JZ2qkUaJXp+lzOI3Br5MCSShPqFL0iqUeMkdgEHAUEm1q7vnSNBqtHFlElZKKfWIdg3AV8XPZmlQlxGRkZT7fC+L1zsCZ6okUl0DLCepdi/uNbZfjdd3AN+T9B1g9Up5M5cDu6jcB3wgMLJBnVuA7SiD4XVA/zjXOtD2Q5SwjJ9JmgT8DVgVWAmYDOwYs/5tbVdHsc4+B4hEL9v3R3tEHy6P8n8BNzVZdygwxvaztt+ixHVuF8vqE7Qabt/2CNtDbA/pt/SAJptJKaXUVT01AL9V1/aSdctriVDNkpvuAD4kaUXK/bi1gWoRYMuYuQ6yvartl2LZu2lOtv9AeRjBq8AoSTt01FnbrwA3ALsBnwP+0KDaPZTZ9LaUGeNEyqxxfCzfF1gRGBwz76eBJW0/TJmdTwZOrB26bvFzqNaBWSlbHaVtVXVUr5UErZRSSj2kpwbgp4H3S3qfSu7yLl1ZOQaGq4HTgAds/18sGg18vVavet62StKawCNxSPYaYJO6Ko1SrM6nPAjhnkrWdLVPbwCPUwboOykz4qOZlXI1AHjG9puStgdWj76sArxi+yLgVMoDFubVrcAecS54JWBYk327C/iopBVUnp60D3BzN2w/pZTSPOqRWU8MQidQBoBHgQfnopnLKLPO4ZWyI4Bfx2HeRSkz0UMbrLs38EVJbwL/Ak6oWz4JeEvSfcBI27+wPV7Si8BvO+jTLcDHbL8i6RbgA8wagC8G/iJpHHAvs/Z5Y8pjAd+hJFgd1vmud+pKypOcpgAPUz7n2qHt2RK6JH2XcohawF9t/3luN5pJWCml1H0yCSvETHUMsF5cCNarSepve6ZKPvbdlEcV/qsnt5lJWCml1HXKJKzmJO0P/BT4Zl8YfMO1cavW4sCPe3rwhUzCmh8yCSulhUevCeJoJ9u/s/1B25d3dV1Ju0vaoPJ+eMyma+/HSJrjm8+8sj0sLkTbwPbIBv2arR8ppZR6lxyA593uwAaV98MpYRjtNpwu9kMdpIillFLqXgvtANwsoUrSYEk3SxovaZSklaP8YJX0rftU0riWlrQV5XanUyJV6zuUW5UujvdL1W1zJ0l3RPrV5ZL6N+jXhyT9LbYzQdJaUX5MbH+SpB9F2UBJD0g6L1KxRktaKhKzZutHB/s1RtLPJN1MeRRiSiml+WChHYBpkFAVQRxnAHvaHgxcQDk3DHCV7aGRHPUAcJDt2ym3OR0Th4NPBsYB+8b7dwNAVB688ANgx0i/Ggd8s0G/LgZ+HdvZCpguaSdgbWALSqLVYEm1QI21o/6GwAvAHravqPaDcl92s/0CWN72R23/V31nMgkrpZR6xsJ8yHEycGrkI19r+xZJGwEbATeUFEn6AdOj/kaSfgIsD/QHRnVxex+hHKq+LdpenBI48q5I9VrV9tUAtl+L8p2AnSjhH8T21wb+F3jU9r1R3izRat0O9gtmJY3NwfYIyq1NLLHy2nnJfEopdZOFdgC2/bCkwcDOlISq0ZTwj6m2t2ywykhgd9v3SRrO7OEXrRBwg+19OqnTrPxE2+fOVigNZPakrLeB2Q57V9Zvtl9QSRFLKaU0fyy0h6CbJFQ9BKwoacuos5ikDWOVZSmHgxejxE7W1KdqNUrZgpKetbWkD0XbS6s8+OFdtl8EnlA8MEHSEpE3PQo4sHbOWNKqkt7fyS5W+9HRfqWUUmqDhXYGTIOEKttvxAVMp0saQPl8fglMBf6Tkjj1GOXwdW1wuxQ4T+VRgHtSZsrnSHqVeEoRgO1nY+Z8iUo8J5Rzwg/X9Ws/4NxIEnsT2Mv2aEnrA3fEIeSZwBcpM95m6vvRbL9alklYKaXUfTIJK7Usk7BSSqnrMgkrzbNMwpq/MhUrpQXbQnsOeG50d+pVplWllNLCKwfgrunu1Kt5XT+llFIftVAMwL0x9apJWtVxsd0pkkYorrhqlI6l4pSoO7myT8Nin/4o6WFJJ0naV9LdUa+WrPVpSXdJmhhtr9Szv4WUUkpVC8UATC9MvapPq4r1z4ztbkS5n3eXqD5HOhbwWUoq1qbAjpQvBitH/U0psZIbU66qXsf2FsD5wOFR51bgI7Y3o1zJ/e1GH1wmYaWUUs9YWC7C6nWpV01sL+nbwNLAe4GpksbQOB1rG+AS228DT6tkOQ8FXgTusT096v0TGF35HLaP1x8ALotBe3Hg0UYdyiSslFLqGQvFANxLU69mX0FaEjgLGGL7cUnHA0vScTpWM9V0rHcq799h1u/8DOA029dIGgYc32pfU0opzbuF4hB0b0y9arD+kvHzuThfvCd0mI41FthbUj9JKwLbAXe39okAMAB4Ml4f0IX1UkopdYOFYgZM7029ql//vNjeNOCeSr050rEoM/gtgfsAA9+2/S9J67X4mRwPXC7pScoXhjU6WyGTsFJKqftkElZqWSZhpZRS12USVppnmYQ1f2USVkoLtoXiHHB3kjRQ0pQmy06QtGMn6x8v6eie6d3ckTQyDsenlFKaT3IG3I1sH9fuPqSUUuobcgY8d/pJOk/SVEmjaylY1ZmkpJ0lPSjpVkmnS7q2sv4GKrnRj8QFXbOJK5tHVlKujoryMZJ+Ken2WLZFlC8j6YJI0ZooabdKO6dE+SRJX4lySTpT0v2SrgM6e7ZwSimlbpYz4LmzNrCP7YMl/RHYA7iotjDu6T0X2M72o5IuqVt/PUogxrLAQ5LOtv1mZfkgSvjGRtHe8pVly9jeStJ2lPSujYDvA3+3fWDUvVvS3yi3UM2wPTSuxr4t7oHeDFiXcnX4SsD90dYcJB0CHALQb7kVu/gxpZRSaiZnwHPnUdv3xuvxwMC65esBj9iupUvVD8DX2X7d9nPAM5RBsOoRYE1JZ0j6JCXdquYSANtjgeViwN0JOFbSvcAYyj3Fq0X5/lF+F/A+ypeH7YgULdtPAX9vtqO2R9geYntIv6UHNP9EUkopdUnOgOdONWnqbUpuc1VHKVWN1p/t92D7eUmbAp8AvgZ8DjiwtriuLcf29rD90GydKDmYh9seVVe+c4N2UkopzUc5A+4ZD1JmsAPj/d5dWTke5rCI7SspoSCbVxbXnnq0DeXw8gxKVvXhMeAiabOoOwo4LBK9kLSOpGUoKVqfj3PEKzMrHzqllNJ8kjPgHmD7VUlfpTx16Tm6FhEJsCrwW0m1L0jfrSx7XtLtwHLMmhX/mJLiNSkG4WmUJymdTzk8PiHKn6U80/hqYAdK6tbDwM2tdCqTsFJKqftkElYPkdTf9swY+H4N/MP2L+axzTHA0bbbEkeVSVgppdR1mYQ1/x0s6QDKo/4mUq6K7tMyCau9MhkrpQVLngOeB5J2l7RB5f3wePISMdt9Adjf9r62X5mH7RwqaX/bw7p79hvJXl/ozjZTSil1LgfgebM7sEHl/XBgle7eiO1zbP+uvlzSPB3BiPUHAjkAp5TSfLZQHoKOK4H/CHwA6Af82PZlkgYDpwH9geeA4banSzqYEkaxOPA/lMcDDgJ2BT4q6QeU+3OHABfXP54wtrkT8CNgCeCfwJdsz6yrM8d2bL8i6Xhgpu1T4zzw7cDWwDWSNgZeAzak3E/8TdvXRhjI2dGnt6L8pnhM4qco9wovAywNrB/3Cl84r+epU0optWZhnQF/EnjK9qaRNnV93KpzBrCn7cGUZKifRv2rbA+1vSnwAHCQ7duBa4BjbA+yfTIwDtg33r9a21jcVvQDYEfbm0e9bzbo1xzbadL/5W1/1PZ/xfuBwEcpA+s5Mfh+DcD2xsA+wIVRDuXLwQG2dwCOBW6JPs8x+Eo6RNI4SePefmVGBx9pSimlrlgoZ8CU229OlXQycK3tWyRtRIl1vCFup+0HTI/6G0n6CbA8ZXY8qkGbHfkI5VD1bdH24sAdDeq1up3L6t7/0fY7wD8kPUJJ4tqG8oUC2w9KegxYJ+rfYPvfrXTc9ghgBMASK6+dl8ynlFI3WSgHYNsPx+HmnYETIx/5amCq7S0brDIS2N32fXEId1gXNynKoLdPJ/Va3c7Lde+bpWM1U79+Siml+WyhPAQdVyq/Yvsi4FRK0tRDwIqStow6i0naMFZZFpgeh6n3rTT1Uixr9r7mTmBrSR+KtpeWtE6Des2205m9JC0iaS1gzdiXsbU2YlurRXm9Zn1OKaXUgxbKGTDlKUCnSHoHeBM4zPYbKo8SPF3SAMpn80tgKiUO8i7gMcrh69qAdSlwnsojBfekzGDPqb8Iy/azMaO9JJ5KBOWc8MN1/Wq2nc48REmzWgk41PZrks6KvkymXIQ13PbrcQi8ahLwlqT7gJEdXYSVSVgppdR9Mgmrj5M0knIe+4qe3lYmYaWUUtdlElaaZ5mE1btkMlZKfVvbzwHHgwU6q3OkpKXnQ19mS4WSNETS6T2wnWlxa1Kr9c+vJm5V2R4+L7NfScMkbTW366eUUpo7bR+AbbfyP/8jKYERLZPUby66M5BKKpTtcbaPmIt2upXtL9u+v758Lvexuv6ilCutcwBOKaX5rO0DsKSZ8XOYpDGSrpD0oKSLVRxBiXe8SdJNUXcnSXdImiDpckn9o3yapOMk3Uq5MniMpJMl3S3pYUnbRr2Bkm6J9SdUZoAnAdtKulfSUdGna2Od90r6k6RJku6UtEmUHy/pgtjWI9Hf2r79SdJ4SVMlHdLCZ3F2hF5MlfSjSvkYSUNqn5ekEyTdBWwZ+1zbx7srV1qvLunG6O+NklaL8pGSTovP8jLgUOCo2Odt5/oXmVJKqUvaPgDX2Ywy292AcjvN1rZPB54Ctre9vTpPlXrN9ja2L433i9reItr9YZQ9A3w81t8bqB1m7igV6kfARNubAN8DqtnM6wGfALYAfhi3EQEcGKlaQ4AjJL2vk/3/fpyo34QScblJgzrLAFNsf9j2rVH2YuzjmZQrt4nXv4v+XlzZRyiBHDva3gM4B/hF7PMt9RtTJmGllFKP6G0D8N22n4hUp3sph4TrVVOl7gUOAFavLK9Piboqfo6vtLcY5fahycDlzP5AhWa2AX4PYPvvwPvidiWA62y/bvs5yuC+UpQfEbf33Al8EFi7k218TtIEyuMLN2zSr7eBK+vKLqn8rN3+tCXwh3j9++h/zeW23+6kL0BJwrI9xPaQfksP6HyFlFJKLeltV0G/Xnn9No3711mqVH3KU63NantHAU8Dm1K+hLzWQt8aJUvV7uGao9+ShgE7AlvGAxXGUB6A0LhxaQ3gaGCo7efj9qJG9V9rMHi6yetmdTIJK6WU2qy3zYCbqaY1tZoq1ZEBwPSYae9HyX2u3069arLUMOA52y92so3nY/BdjzJz78hylIFxhqSVgP9oZUfC3pWftYzp24HPx+t9gVvrVwqZhJVSSm3Q22bAzYwA/lvS9DgPPJzOU6U6chZwpaS9gJuYNSOcLRWKcii45njgt5ImAa9QDn135Hrg0Kj/EOWLQ1OR/zyRkrz1CHBbF/ZnibgoaxHKk48AjgAukHQM8CzwpSbr/gW4QtJuwOGNzgPXZBJWSil1n0zC6uMkTQOGxPnnHpVJWCml1HWZhJXmWSZh9S6ZhJVS39ZXzgF3SdznO6Ub2llFUo9nLM8L2wMbzX7rU71SSin1LgvkANxdbD9le89292MuDaSS6tWKeU3WSiml1LoFeQBeVNKFkQR1hSJLWpUcZpWs5zHx+qORBnWvpImSlq3OpCUNl3SVpOsl/UPSz2sb6iCZ6yRJ90cfTo2yvSRNkXSfpLGNOi7p25ImR52Tomyt2PZ4lRSv9aJ8pKTTJd0eSVy1Lwz1qV79JJ0i6Z7oz1di/WGSbpL0B8ojEFNKKc0HC/I54HWBg2zfJukC4KvAqR3UPxr4WtTvT+N7gwdR0rpeBx6SdAbwKrOSuV6W9B3gm5LOBD4DrGfbkpaPNo4DPmH7yUrZuyT9B7A78OG4hem9sWgE5Vm//5D0YcqV3DvEspUpQRvrAdcAV1BSvY62vUu0ewgww/bQuHr8NkmjY/0tgI1sP9qgP4cAhwD0W27FDj6+lFJKXbEgz4Aft127leciZk+CauQ24DSVLOflbb/VoM6NtmfYfg24n5LA1SyZ60XKIH6+pM9Sbl2qbWekpIOZdf9x1Y7Ab22/AmD73/GFYCvg8tjGuZRBt+ZPtt+JBzasNEeLxU7A/rH+XcD7mJXMdXejwTe2n0lYKaXUAxbkGXD9/VW1928x64vHu0lTtk+SdB2wM3CnpB2ZcxbcKKmraTKXpC2Aj1ECMb4O7GD70JjBfgq4V9Ig2/9XXa1B3xcBXrA9qMm+VvvVKLGrVn647VF1fRxGJmOllNJ8tyDPgFeTVMtF3odZSVDTgMHxeo9aZUlr2Z5s+2TKAx7Wa3E7DZO5YtY6wPZfKQ+CGFTZzl22jwOeo2REV40GDqycs35vJG49GsEhqNi0k37VJ1yNAg5TPCgi+rhMi/uYUkqpmy3IM+AHgAMknQv8Azg7yn8E/EbS9yiHYmuOlLQ9ZWZ7P/DfzH6YtyHbzzZJ5noJ+LOkJSmzz6Ni2SmS1o6yG4H76tq7XtIgYJykN4C/Up6+tC9wtqQfUB4mcWn9unXqU71+RbkyeoIkUdKxdu9s/6oyCSullLpPJmGllmUSVkopdV0mYaV5lklYvVsmY6XUtyzI54B7LUkfjHtvH5A0VdI3GtQ5WpJr9yzXLRsetzmllFLqo3IG3B5vAd+yPUHSssB4STfEbURI+iDwceB/52enJC3a5ParlFJK3SxnwG1ge7rtCfH6JcoFY6tWqvwC+DZz3o5UtUqTVK59IkVriqSTK+UzK6/3lDQyXo+UdJqkm4CTSSmlNF/kDLjNJA2kpGvdFe93BZ6M5wN3tGqjVK63KYPoYOB5YLSk3W3/qZNurENJ8nq7Qf8yCSullHpAzoDbKO4VvhI40vaLce/v9ylxlZ1plMo1FBhj+9k4lHwxsF0LbV3eaPCFTMJKKaWekgNwm0QgxpXAxbaviuK1gDWA+yRNAz5AuW/3/zVoolkqVzPVw9lL1i3LJKyUUprPWhqAJS0jaZF4vY6kXWuJSqnrIgjjN8ADtk+rlUcS1/vjGb8DgSeAzW3/q8Wm7wI+KmkFlUcL7gPcHMuelrR+/B4/0207k1JKaa60eg54LOXRdu+hpDeNA/ampDOlrtsa2A+YHA9HAPhexFbONdvTJX0XuIkyG/6r7T/H4mOBa4HHgSlA/662n0lYKaXUfVpKwpI0wfbmkg4HlrL9c0kTbW/W811MvUUmYaWUUtfNaxKW4sEG+wIHdXHdtIDIJKy+K1OyUup9Wr0I60jgu8DVtqdKWpNymHOBJGmgpCldXGe4pFVaqDdS0p5z37sO2x4o6QuV94Mk7VzXx0zQSimlXqClAdj2zbZ3jUf1YfsR20f0bNf6nOFApwNwDxsIfKHyfhDl+cYppZR6mQ4PI0v6Cx2kMdnetdt71Hv0k3QesBXwJLCb7VfjUYHnAEsD/wQOBD4GDAEulvQqsCWwAXAa5WKn54Dhtqc325ikI4BDKTGV99v+fDyv9wxgY8rv6njbf47wjt8Dtef5ft327cBJwPpxYdclwNeApSRtA5xYt70VYz9Wi6Ijbd82V59USimlLutsBnwq8F/Ao8CrwHnxbyblStoF2drAr21vCLwA7BHlvwO+Y3sTYDLwQ9tXUK4M39f2IMogegawp+3BwAXATzvZ3rHAZtHuoVH2feDvtocC21OeJbwM8AzwcdubU65GP73Sxi22B8XRiuOAy+L9ZXXb+xXwi2h7D+D8Rp2SdIikcZLGvf3KjE52IaWUUqs6nAHbvhlA0o9tVxOV/iJpbI/2rP0etV27RWg8MFDSAGD52ucCXAhc3mDddYGNgBsiTrIf0HT2GyZRZtB/AmrRkTsBu0o6Ot4vSZmxPgWcGbPxtylRkl21I7BBJe5yOUnLRjb1u2yPAEYALLHy2vnw6JRS6iatXsm8oqQ1bT8CIGkNYEEPBq5PmlqqC+sKmGp7yy6s8wCa0VcAACAASURBVClKbOSuwH9K2jDa2cP2Q7M1Lh0PPA1sSjmK8VoXtlOzCLCl7VfnYt2UUkrzqNWroI8CxkgaI2kM5QroOZ5hu6CzPQN4XtK2UbQfs5KmXgKWjdcPUb60bAkldjIG1IYineqDtm+iPAVpecq541HA4ZGchaTafdcDgOm234k+9GvQh0bvq0YDX6/0YVAHu55SSqmbdToDjsHhRco50fWi+EHbrzdfa4F2AHBOPDjhEeBLUT4yymsXYe0JnB6HrRcFfglMbdJmP+CiqCvKudkXJP041psUg/A0YBfgLOBKSXtRvgzVspwnAW9Jui/6cyFwbFyUNdtFWMARwK8lTYr+jWXWueeGMgkrpZS6T6tJWHd08XBqWgBlElZKKXXdvCZhjZa0B3CVWxmx0wIpk7D6rkzCSqn3afUc8DcpV/u+IelFSS9JerEH+9WnSJrZyfIuJ2vNY3+GSDq985oppZTapaUZsO1mF/KkXsj2OMp9ySmllHqpVmfAxDOAT41/u/Rkp/oqSf0l3ShpgqTJknZrUGdNSRMlDZXUT9Ipku6RNEnSVxrUn232LOnouA2JuCr9ZEl3S3q4dnW2pGGSro3X75X0p2j/TkmbRPnxki6INh6JJK6UUkrzSUszYEknAUOBi6PoG5K2sX1sj/Wsb3oN+IztFyWtANwp6ZraQknrApcCX7J9r6RDgBm2h0paArhN0mjbj3Zhm4va3iIeuvBDSsBG1Y+AibZ3l7QDJcmrdsvRepSErWWBhySdbfvN6srRx0MA+i23oN/6nVJK80+rF2HtDAyK+06RdCEwkRJ9mGYR8DNJ2wHvAKsCK8WyFYE/U4I1arcj7QRsUnk60gDK7V5dGYCvip/jKQ9jqLcNEaNp+++S3he3OwFcF7eTvS7pmejrE9WVMwkrpZR6Rlee6bs88O94PaCjiguxfSkD7WDbb0qaRomPBJgBPA5szaz7gQUcbntUB22+xeynCpasW167H/ttGv8+1aCsNpDWp33lM55TSmk+afUc8M+ACfEs2wsps62f9Vy3+qwBwDMx+G4PrF5Z9gawO7B/5Zm9o4DDJC0GIGmdeNhC1dPA+2PmugQliKMrxlK+GCBpGPCc7byCPaWU2qzVGc+nKE/0eR74X8rTgP7VY73quy6mPKhiHHAv8GB1oe2X4wK2GyS9THkC0UDKlxsBz1IG6eo6b0o6AbiLcmh6tjZbcDzw20i8eoWS5DVXMgkrpZS6T6tJWDtQziVuC6xJGVzG2v5Vz3Yv9SaZhJVSSl3XLAmrpQE4GuhHuRJ6e0pm8Ku21+t4rbQgWWLltb3yAb9sdzdSN8uUrJR6VrMBuKVzwJJuBG6jPPz9IWDogjL4SjpC0gOSLo57nY+N8t0lbdDu/s2tvt7/lFJa0LV6EdYkykVEGwGbABtJ6srzcXuzrwI7297X9jW2T4ry3YG+PIB1uf+S8irolFKaT1oagG0fZXs74DPA/wG/BV7oyY7ND5LOoZzTvkbSUZKGSzpT0lbArsApku6VtFYHqVMN06wkrSxpbKw/RdK2UXdkvJ8s6agGfVpJ0tWS7ot/W0X5F2Pb90o6N04JIGmmpJ9G3Ttj/Ub9X0vS9ZLGS7pF0nqx/khJp0m6CTh5PnzsKaWUaD0J6+uUC7AGA49Rroi+pQf7NV/YPlTSJ4HtbT8naXiU3x4JVtfavgKgXKTcMHXqIBqkWQGfBUbZ/mkMlktTEqhWtb1RtLl8g26dDtxs+zOxXn9J61MO/28dV0WfRbm16HfAMsCdtr8v6efAwbZ/0qD/NwKH2v6HpA9Tnim8Q2xzHWBH22/XdyaTsFJKqWe0eshxKeA0YLztt3qwP71do9SpZmlW9wAXxD2+f4royUeANSWdAVwHjG6wjR2A/QFiQJwhaT/Kl5974ovAUsAzUf8N4NpKvz5e36Ck/sBWwOWxPsASlSqXNxp8ow+ZhJVSSj2g1achndLTHekjGqVONU2zikjKTwG/l3SK7d9J2hT4BPA14HPAgS1sV8CFtr/bYNmblWc0N0uzWgR4wfagBssAXm6hDymllLpRy09DWgi9RHlIQWcapllJWp2SinUe8Btgc5UHNCxi+0rgP4HNG7R3I3BYtNVP0nJRtqek90f5e6P9lvofyVePStor1ld8EUgppdQmedVrc5cC56k8pm/PDuo1S7MaBhwj6U1gJuWw8qqUVKraF59GM9pvACMkHUSZ0R5m+w5JPwBGx7pvUmbQj3Wh//sCZ0c7i8Xy+zr+CGaXSVgppdR9Wg7iSCmTsFJKqeuaBXHkDDi1bPKTMxh47HXt7kbqYZmMldL8keeAe1B9GlXcZ7xK5f0YSXN8K+qgva7WHxS3THVWb5ikazurl1JKqfvkANyz6tOohgOrNK7aIwYBnQ7AKaWU5r8cgBuIq5ivi3SpKZL2jvLBkm6ONKlRklaO8oMjCes+SVdKWrpBGtV3gCHAxfF+qbpt7iTpDkkTJF0e9+42spfmTONaUtJvI11roqTtJS0OnADsHdvbO/brgujrREm79dBHmFJKqRM5ADf2SeAp25tGatX1cZvRGcCetgdT0sB+GvWvsj3U9qbAA8BBtm8HrgGOsT3I9snAOGDfeP9qbWNxe9IPKGlUm0e9bzbp26K2twCOpKRxQbkiGtsbA/sAF1J+t8cBl8X2LgO+D/zddu2pVqdIWqajD0LSIZLGSRr39iszWvv0UkopdSovwmpsMnCqpJMpcY63SNqI8jCKGyJNqh8wPepvJOknwPJAf8q9wV3xEcqh6tui7cWBO5rUbZTGtQ3lywG2H5T0GCVest5OwK6Sjo73SwKrddSxTMJKKaWekQNwA7YfljSYcv70xMh2vhqYanvLBquMBHa3fV/kSQ/r4iYF3GB7nxbqNkvjanU7e9h+aLZCaaUW108ppdRN8hB0A3Gl8iu2LwJOpSRWPQSsKGnLqLOYpA1jlWWB6XGYet9KU/VpWs3Ste4Etpb0oWh7aUmNZrDNjK1tN9ZbLfpbv71RwOERGIKkzbqwjZRSSt0oZ8CNbUw5P/oOJXXqMNtvxAMXTpc0gPLZ/RKYSomVvIuSTDWZWYNefRrVSOAcSa8C786kbT8bM+dLVJ6oBOWc8MMt9vesaHcy8BYw3PbrKo8YPFbSvcCJwI+jz5NiEJ4G7NLyh5JJWCml1G0yCSu1LJOwUkqp6zIJK82zTMJaOGQSVkrzx0J/DljSKpKuqLy/RNIkSUdJOkHSjvOxL/VJWdPiFqWe3u73enobKaWUZrfQz4BtP0U87UjS/wO2st3Zo/66naR+lKSsKcBT83nz3wN+Np+3mVJKC7U+OwOWNFDSg5IujBnrFZKWjmXHRdrTFEkjKlf9fkjS3yKxaoKktaKdKdHsaOD9kRy1raSRceEVkoZKuj3WvVvSsnX9GaaS1XxF9OviynY/FslTkyOJaokonxZ9vZUSoNEoKevw6OtkSevFepMlLa/i/yTtH+W/l7SjynOET4nPYJKkr8TylSWNjfanxD6eBCwVZRf32C8spZTSbPrsABzWBUbY3gR4EfhqlJ8ZyVQbAUsx60rfi4FfR2LVVswK0qjZFfhnJEfdUitUiXW8DPhGrLsj8Cpz2oySULUBsCbl1qIlKVc/7x1JVYsCh1XWec32NnHLU6OkrOciHetsoBagcRuwNbAh8AiwbZR/hHJL00HAjEi8GgocLGkN4AvAKNuDgE2Be20fC7wa26zeQlXb90zCSimlHtDXB+DHbd8Wry+iJEIBbC/prrgtZwdgw5ixrmr7agDbr9l+pcXtrAtMt31PrPui7bca1Lvb9hO23wHupSRVrQs8art2S9GFwHaVdS7rZNuNkq9uiTa2owzMG0taFfi37ZmUxKv94/aju4D3AWsD9wBfknQ8sLHtlzrbcdsjbA+xPaTf0gM6q55SSqlFfX0Arr+HyjHjPIuS2bwxcB4lcrHVtKhG1GBbjbxeeV1Lqupsuy+32GY1+WosZda7LTAGeJZyHrs2axdweMxqB9lew/Zo22Mpg/aTwO9rh65TSinNf319AF6tlkxFOYd6K2WwBXhO5YlCe0KZtQJPSNodQNIStXPGLXgQWEXS0Fh3WUmtXsD2IDBQkXIF7Afc3KRus6Ss2dh+HFgBWNv2I5T9PppZA/Ao4DCVZC4kraPyJKTVgWdsnwf8hpLwBfBmrW5KKaX5o69fBf0AcICkc4F/AGfbfkXSeZREqmmUw641+wHnSjqBknC1F/BOZxuJFKy9gTPi4qhXKeeBZ7aw7muSvgRcHoP2PcA5TaqPpEFSVhN3UR4IAWXgPZEyEAOcTzlcPSEuBHuW8mziYcAxkt6MvtdmwCMo6VgTGp0HrskkrJRS6j59NglL0kDKk4o2anNXFhqZhJVSSl2nTMJK8yqTsBZOmYyVUs/os+eAbU/r67PfTKBKKaWFV58dgBcQLQ3AkZKVUkppAZIDcAck/WekWt2gkhF9dJSPkTQkXq8gaVq87rYEKkkzVbKo7wK27CBNq6OUrZ9JuiOCNDaXNErSPyUd2qxf8+FjTSmlRA7ATcUAuwcl3eqzlJjIznRbAhWwDDDF9ocpCVkjqUvTaiFl63HbW1Kukh5JuSXrI8AJsXyOfjX4HDIJK6WUekAOwM1tA/zZ9quRGPWXFtbptgQqSvDGlfG6WZpWZylb18TPycBdtl+y/SzwmqTlW+lXJmGllFLPyAG4uY4SrN5i1me3ZKW8OxOoXrP9did96Sxlq5ai9Q6zp3S9AyyayVgppdQ+OQA3dyvwaUlLRqJW9V6MacDgeL1npbynEqiapWl1JWVrDh30K6WUUg/L+4CbsH2PpGuA+4DHKOdhaydBTwX+KGk/4O+V1XokgapZmpbt17uQstVIs341lElYKaXUffpsEtb8IKm/7ZmRGT0WOMT2hHb3q10yCSullLouk7DmzghJG1DO8164MA++kElYqXOZmpVS6/IccAdsfyEuplrP9ont7s/ckPTXuOK5vnwvSQ9IuknSEEmnt6N/KaW0sMoZ8ALO9s5NFh0EfNX2TfE+jy2nlNJ8lDPgPkzStyUdEa9/Ienv8fpjki6K19MkrVC33nGU+5zPieSuYZKund/9TymlhVkOwH3bWKAWHzkE6B+3NW1DSb9qyPYJlBnvvraP6WgDmYSVUko9Iwfgvm08MFjSspSgjTsoA/G2dDAAd0UmYaWUUs/Ic8B9mO0340EQXwJuByYB2wNrAQ+0sWsppZQ6kTPgvm8scHT8vAU4lPKwh7zBO6WUerGcAfd9twDfB+6w/bKk1+imw8/1MgkrpZS6Tw7AfZztG4HFKu/XqVs+sMl6wyqvxwBjeqJ/KaWUGssBOLUsk7BSWnhkqlnPy3PACzhJx0s6Ol6PlLRnZ+uklFLqeTkAp5RSSm2QAzAQz+y9TtJ9kqZI2jvKp0k6WdLd8e9DUf5pSXdJmijpb5JWivL+kn4rabKkSZL2iPKdJN0haYKky+P5wvV9GFPZ1sOSto3yfpFWdU+0+ZXKOsdUyn9UKf++pIck/Q1Yt8k+D5Z0s6TxkkZJWrkbP9KUUkqdyAG4+CTwlO1NbW8EXF9Z9qLtLYAzgV9G2a3AR2xvBlwKfDvK/xOYYXtj25sAf48YyB8AO9renJJA9c0m/Vg0tnUk8MMoOyjaHAoMBQ6WtIaknYC1gS2AQZRAju0kDQY+D2wGfDbWmU2kZZ0B7Gl7MHAB8NNGHcokrJRS6hl5EVYxGThV0snAtbart/FcUvn5i3j9AeCymDUuDjwa5TtSBj8AbD8vaRdgA+A2SUT9O5r046r4OR4YGK93AjapnLsdQBl4d4p/E6O8f5QvC1xt+xUASdc02M66wEbADdGnfsD0Rh2yPQIYAbDEymvnvcUppdRNcgAGbD8cM8edgRMljY68ZIDqoFN7fQZwmu1rJA0Djo9y1dWvld1ge58WuvJ6/HybWb8bAYfbHjVbo9IngBNtn1tXfmSDPtQTMNX2li30KaWUUg/IQ9CApFWAV2xfBJwKbF5ZvHflZ23mOgB4Ml4fUKk7Gvh6pd33AHcCW1fOHy8tabZ7dTsxCjgsDhsjaR1Jy0T5gbXzyZJWlfR+SiLWZyQtFRnRn27Q5kPAipK2jHUXk7RhF/qUUkppHuUMuNgYOEXSO8CbwGGVZUtIuovyZaU2iz0euFzSk5QBdo0o/wnwa0lTKLPYH9m+StJw4BJJS0S9HwAPt9i38ymHoyeoHC9+Ftjd9mhJ6wN3xGHkmcAXbU+QdBlwL/AYDVKxbL8Rh7RPlzSA8nfwS2BqRx3JJKyUUuo+ysjg5uJBB0NsP9fuvvQGQ4YM8bhx49rdjZRS6lMkjbc9pL48Z8CpZZmElVLqrfpiclefPgdcTXnqgbYHAt/rS7NfSQMlfaHyfpCknSvvh0s6sz29SymlVNWnB+AeNhD4QmeVepmBzN7nQZQru1NKKfUyfW4AbpbyFLO9OyMV6mpJ75H0fknjY/mmkixptXj/z7gieaSk0yXdLumRyv22JwHbSrpX0lGSlqykXE2UtH2081dJm8TriZKOi9c/lvRlScMi5eoKSQ9KujgupqrfryMk3R/9vzTKlpF0QaRdTZS0W5QPlHSLSrLWBElbNejzd4ATgL3j/d5121tR0pXR9j2Stu6mX1FKKaUW9KlzwHUpT4sCEyihFQC/o9wve7OkE4Af2j4yBs7lgG0pKVTbSroVeMb2KzEWrgxsA6wHXANcARwLHG17l9j2twBsbyxpPWB03E40NtqcBrwF1AaybYCLou3NgA2Bp4Dbos6tdbt3LLCG7dclLR9l3wf+bvvAKLs7vng8A3zc9muS1qaEhAxp0OenKReRfT3eD69s71fAL2zfGl9KRgHrN/jMDwEOAei33IoNfy8ppZS6rk8NwJRBdI6Up7iVZnnbN0e9C4HL4/XtlAFvO+BnlNhJMfvtOX+y/Q5wvyLXuYFtKAEc2H5Q0mPAOtHOEZQ0rOuAj0taGhho+yGVtKy7bT8Rfb2Xcqi4fgCeBFws6U/An6JsJ2DXynnuJYHVKAP5mZIGUW536sp9xTU7AhtUJuPLSVrW9kvVSpmElVJKPaOvDcDQecpTvVsoA/fqwJ+B70Qb11bqvF55Pcfh4U7K76HMPh8BbgBWAA5m1sy8vv1qylXVpyhfEnYF/jOCMQTsYfuh2ToiHQ88DWxKOY3wWpO+dWQRYEvbr87FuimllOZRXzsH3DDlyfYM4HnFE4SA/YCbK+t8EfhHzHL/Tbkw6bZOtvUSJVe5uu19oaRRUWaiD9l+A3gc+BwllOMW4GgaBGA0I2kR4IO2b6I82GF5SrbzKODw2jljSZvFKgOA6bE/+1GynBv1uf59VX1q16BW+5tSSmne9akZcCcpTwcA58Th30eAL8U602L8Ghv1bgU+YPv5TjY3CXhL0n3ASOCsaH8y5VzvcNu1me0twMfinPItlIc1tDwAUwbQi+JQuijnZl+Q9GNKQtWkGISnAbtEX66UtBdwE/Bykz5fCBwbh71PrNvmEZTUrkmUv4OxwKEddTKTsFJKqftkElZqWSZhpZRS1ymTsNK8yiSslNLCYH6lavW1c8BtEffxzvHtpTeK+46v7bzmu/VnS8tKKaU0f+QAvACRNDdHNDItK6WU2iAH4BDpUg9KujDSqK6IC7rq650taZykqZJ+VCkfGmla90m6W9KykvpJOiWSpiZJ+koH2z1f0pRIytpR0m2S/iFpi6i3RbQ/MX6uG+XDJV0u6S+UK5up69NESWs2StWStDgdpGWllFLqOTkAz25dYITtTYAXga82qPP9OJm+CfBRSZvEQHYZ8A3bm1JCLl4FDgJm2B4KDAUOlrRGgzY/REmm2oSSxvUFSvDH0cD3os6DwHa2NwOOo4SK1GwJHGB7h1pBxFOeA+xm+xFmpWoNBbYHTgEWi7Yusz3I9mX1HZN0SHzhGPf2KzM6/PBSSim1Li/Cmt3jtmv3B19EuVXn1Lo6n4t4xkUpMZMbUII9ptu+B8D2iwCSdgI20ax86QHA2pTUrKpHbU+OdaYCN9p23PI0sLLuhRE9acrgWXOD7X9X3q9PSa/ayfZTUdYsVatDmYSVUko9Iwfg2dUPMLO9j9nr0cBQ289LGkkZyNRgXaL8cNujOtluNSnrncr7d5j1O/oxcJPtz6g8KnFMZZ2Xmd306NdmlNjKWl8apWp9uJO+pZRS6gF5CHp2q0naMl7vw5x5zctRBrsZkRn9H1H+ILCKpKEAcf53UUqS1WGSFovydSQtM5d9GwA8Ga+Hd1L3BUq05c8kDYuyZqlaHaVlpZRS6iE5A57dA8ABks4F/gGcXV1o+z5JE4GplLSt26L8jbiA6QxJS1HO/+4InE85hDwhBr5ngd3nsm8/pxyC/ibw984q235a0qeB/5Z0IGUG3ShV6yYqaVmNzgPXZBJWSil1n0zCCnFY91rbG7W5K71WJmGllFLXZRJWmmeZhJVSWhj1VDJWngMOtqf1xOw37u/dIF7P7IH2R1ausk4ppdRH5Ay4h9n+crv7kFJKqffJGXA36ChFq1GOtKQVJN0h6VPx/phKWtaPmmxjpqT/kjRB0o2SVmxQ57hoZ4qkEZUrnj8k6W+R0jVB0lqtbjellFLPyAG4+7SSokXcvnQdcJzt6yKsY21gC0ou82BJ2zVYdRlggu3NgZuBHzaoc6btoXEofSnKVc4AFwO/jpSurYDprW43k7BSSqln5ADcfepTtLZpUGcx4Ebg27ZviLKd4t9EYAIlinLtBuu+Q4m77Kj97SXdFQlaOwAbSloWWNX21QC2X7P9SqvbtT3C9hDbQ/otPaDDDyCllFLr8hxw9+kwRSu8BYwHPkGZxUJJqDrR9rnzsj1JSwJnAUNsPy7peGaldDUyt9tNKaXUDXIG3H06S9GCMmgeCKwn6dgoGwUcKKk/gKRVJb2/wbqLALWrnb/QoP0l4+dz0dae8G4u9ROSdo/2l4jz061uN6WUUg/IGXD36TBFq8b225I+D/xF0ou2z5K0PnBHXDM1E/gi8Ezdqi9TDimPB2YAe9e1+4Kk84DJlJSreyqL9wPOlXQC8Cawl+3RLW73XZmElVJK3SeTsLrB/EjRkjTTdv+ear8VmYSVUkpd1ywJKw9Bp5RSSm2Qh6C7ge1pQI9mSLd79ptSSql75Qw4pZRSaoMcgFNKKaU2yAE4pZRSaoMcgFNKKaU2yAE4pZRSaoMcgFNKKaU2yCCO1DJJLwEPtbsffcAKwHPt7kQfkJ9Ta/Jzak1v/pxWtz3HI2TzPuDUFQ81SnNJs5M0Lj+nzuXn1Jr8nFrTFz+nPASdUkoptUEOwCmllFIb5ACcumJEuzvQR+Tn1Jr8nFqTn1Nr+tznlBdhpZRSSm2QM+CUUkqpDXIATimllNogB+DUKUmflPSQpP+RdGy7+9NbSPqgpJskPSBpqqRvRPl7Jd0g6R/x8z3t7mtvIKmfpImSro33a0i6Kz6nyyQt3u4+9gaSlpd0haQH429ry/ybmpOko+K/uymSLpG0ZF/7m8oBOHVIUj/g18B/ABsA+0jaoL296jXeAr5le33gI8DX4rM5FrjR9trAjfE+wTeAByrvTwZ+EZ/T88BBbelV7/Mr4Hrb6wGbUj6z/JuqkLQqcAQwxPZGQD/g8/Sxv6kcgFNntgD+x/Yjtt8ALgV2a3OfegXb021PiNcvUf5HuSrl87kwql0I7N6eHvYekj4AfAo4P94L2AG4Iqrk5wRIWg7YDvgNgO03bL9A/k01siiwlKRFgaWB6fSxv6kcgFNnVgUer7x/IspShaSBwGbAXcBKtqdDGaSB97evZ73GL4FvA+/E+/cBL9h+K97n31WxJvAs8Ns4XH++pGXIv6nZ2H4SOBX4X8rAOwMYTx/7m8oBOHVGDcry3rUKSf2BK4Ejbb/Y7v70NpJ2AZ6xPb5a3KBq/l2VWd3mwNm2NwNeZiE/3NxInAPfDVgDWAVYhnKarF6v/pvKATh15gngg5X3HwCealNfeh1Ji1EG34vt/9/e3YPKVYRxGH/+JN4iIEisApIEIVikUbRQkyKJViLaRC0iBsU6iGihVhZpLVJGDGrQQkIgaVJdb8QPMBbxA7XToBf8SLBRC9H4Wsxc3NjsJcTMudzn1+zZc86efXeZPe/OzDkzdbKv/inJlr59C/DzqPgmYhfwUJILtC6MfbQa8U29+RAsVyuWgeWq+rg/P0FLyJapK90PfFtVF6vqT+AkcC9rrEyZgDXPJ8COfnXhAu1Ch9ODY5qE3o/5GvB1Vb0ys+k0cLAvHwROXe/YpqSqXqiqW6pqO638vFtVB4AlYH/fbd1/TwBV9SPwfZLb+qr7gK+wTP3Xd8DdSTb13+HK97SmypQjYWmuJA/QaiwbgGNVdXhwSJOQZDfwPvAF//ZtvkjrB34H2Eo7UTxSVb8MCXJikuwBnquqB5PcSqsRbwbOA49X1R8j45uCJLfTLlZbAL4BnqRVlixTM5K8DDxGuxvhPPA0rc93zZQpE7AkSQPYBC1J0gAmYEmSBjABS5I0gAlYkqQBTMCSJA1gApa0riR5Jsmm0XFI3oYkaV3pI3LdVVWXRsei9c0asKTJSfJEks+TfJbkeJJtSRb7usUkW/t+ryfZP/O63/rjniRnZ+bVfSvNIdrYwUtJlsZ8OqnZOH8XSbp+kuwEXgJ2VdWlJJtpU8u9WVVvJHkKOML8qebuAHbSxgP+sB/vSJJngb3WgDWaNWBJU7MPOLGSIPuQi/cAb/ftx4HdqzjOuaparqq/gU+B7f9DrNJVMwFLmpowfxq5le1/0c9jfVD+hZl9ZscAvowtfpoYE7CkqVkEHk1yM0Bvgv6INpMSwAHgg758AbizLz8M3LCK4/8K3HitgpWulv8IJU1KVX2Z5DDwXpLLtFltDgHHkjwPXKTNEATwKnAqyTla4v59FW9xFDiT5Ieq2nvtP4G0Ot6GJEnSj2b25gAAAC1JREFUADZBS5I0gAlYkqQBTMCSJA1gApYkaQATsCRJA5iAJUkawAQsSdIA/wABkKl3OBTzLAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "common_words = get_top_n_words(df['desc'], 20)\n",
    "df_common_words = pd.DataFrame(common_words, columns=['desc', 'count'])\n",
    "plt.barh(df_common_words['desc'], df_common_words['count'])\n",
    "plt.xlabel('count')\n",
    "plt.ylabel('words')\n",
    "plt.title('top 20')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这样所有的词都连起来了，第一个词Pike Place是西雅图的一个广场、以及wifi等关键字眼。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 文本清洗\n",
    "描述的一些统计信息"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['word_count'] = df['desc'].apply(lambda x:len(str(x).split()))  # 计算每个描述的长度"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "      <th>desc</th>\n",
       "      <th>word_count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Hilton Garden Seattle Downtown</td>\n",
       "      <td>1821 Boren Avenue, Seattle Washington 98101 USA</td>\n",
       "      <td>Located on the southern tip of Lake Union, the...</td>\n",
       "      <td>184</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sheraton Grand Seattle</td>\n",
       "      <td>1400 6th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Located in the city's vibrant core, the Sherat...</td>\n",
       "      <td>152</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Crowne Plaza Seattle Downtown</td>\n",
       "      <td>1113 6th Ave, Seattle, WA 98101</td>\n",
       "      <td>Located in the heart of downtown Seattle, the ...</td>\n",
       "      <td>147</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Kimpton Hotel Monaco Seattle</td>\n",
       "      <td>1101 4th Ave, Seattle, WA98101</td>\n",
       "      <td>What?s near our hotel downtown Seattle locatio...</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>The Westin Seattle</td>\n",
       "      <td>1900 5th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Situated amid incredible shopping and iconic a...</td>\n",
       "      <td>151</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                             name  \\\n",
       "0  Hilton Garden Seattle Downtown   \n",
       "1          Sheraton Grand Seattle   \n",
       "2   Crowne Plaza Seattle Downtown   \n",
       "3   Kimpton Hotel Monaco Seattle    \n",
       "4              The Westin Seattle   \n",
       "\n",
       "                                           address  \\\n",
       "0  1821 Boren Avenue, Seattle Washington 98101 USA   \n",
       "1   1400 6th Avenue, Seattle, Washington 98101 USA   \n",
       "2                  1113 6th Ave, Seattle, WA 98101   \n",
       "3                   1101 4th Ave, Seattle, WA98101   \n",
       "4   1900 5th Avenue, Seattle, Washington 98101 USA   \n",
       "\n",
       "                                                desc  word_count  \n",
       "0  Located on the southern tip of Lake Union, the...         184  \n",
       "1  Located in the city's vibrant core, the Sherat...         152  \n",
       "2  Located in the heart of downtown Seattle, the ...         147  \n",
       "3  What?s near our hotel downtown Seattle locatio...         150  \n",
       "4  Situated amid incredible shopping and iconic a...         151  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAANR0lEQVR4nO3db4xld13H8ffHLv8K1QKdktoyThubBkKgJRNorVEsYBbaUB/0AQ1g1Zp5IloMCW5DIvFZjQbQaNCNrZDYFGOB0LRRuikQYoLF3bLAlm1pwRXWVrak/DFoLNWvD+ZsMw67M7P3npnZ79z3K7m59/zub+75/s7c+cyZc8/5TaoKSVI/P7HdBUiSJmOAS1JTBrgkNWWAS1JTBrgkNbVrK1d2zjnn1MLCwlauUpLaO3DgwHeqam51+5YG+MLCAvv379/KVUpSe0n+9UTtHkKRpKYMcElqygCXpKYMcElqygCXpKYMcElqygCXpKYMcElqygCXpKa29EpM9bCw556TPnfklqu3sBJJa3EPXJKaMsAlqSkDXJKaMsAlqSkDXJKaMsAlqSkDXJKaMsAlqSkDXJKaMsAlqSkDXJKaWjfAk9yW5FiSQyva/ijJQ0m+nOQTSc7e3DIlSattZA/8w8DuVW37gFdU1SuBrwE3j1yXJGkd6wZ4VX0OeHJV271V9fSw+E/ABZtQmyRpDWMcA/8N4O9HeB1J0imYKsCTvBd4Grh9jT5LSfYn2f/EE09MszpJ0goTB3iSG4BrgLdVVZ2sX1XtrarFqlqcm5ubdHWSpFUm+o88SXYDvwf8YlX957glSZI2YiOnEd4BfB64JMnRJDcCfwacBexLcjDJX2xynZKkVdbdA6+q60/QfOsm1CJJOgVeiSlJTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTRngktSUAS5JTa0b4EluS3IsyaEVbS9Ksi/JI8P9Cze3TEnSahvZA/8wsHtV2x7gvqq6GLhvWJYkbaF1A7yqPgc8uar5WuAjw+OPAL8ycl2SpHVMegz8JVX1OMBwf+54JUmSNmLXZq8gyRKwBDA/P7/Zq9MJLOy554TtR265eosrkTSmSffAv53kPIDh/tjJOlbV3qparKrFubm5CVcnSVpt0gC/C7hheHwD8MlxypEkbdRGTiO8A/g8cEmSo0luBG4B3pjkEeCNw7IkaQutewy8qq4/yVOvH7kWSdIp8EpMSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpjZ9OllNbrOngT3Z60/yNZs9Na1T4ko/zj1wSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpgxwSWrKAJekpqYK8CS/m+TBJIeS3JHkuWMVJkla28QBnuR84HeAxap6BXAG8NaxCpMkrW3aQyi7gOcl2QWcCTw2fUmSpI2YeD7wqvq3JH8MfBP4L+Deqrp3db8kS8ASwPz8/KSr0w7j/N7S9KY5hPJC4FrgQuCngecnefvqflW1t6oWq2pxbm5u8kolSf/PNIdQ3gD8S1U9UVU/Aj4O/Nw4ZUmS1jNNgH8TuDzJmUkCvB44PE5ZkqT1TBzgVXU/cCfwAPCV4bX2jlSXJGkdU/1T46p6H/C+kWqRJJ0Cr8SUpKYMcElqygCXpKYMcElqygCXpKYMcElqygCXpKYMcElqygCXpKYMcElqaqpL6Ts62TzUcPK5qLvMXb3W2CTtPO6BS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNTVVgCc5O8mdSR5KcjjJFWMVJkla27SzEf4J8A9VdV2SZwNnjlCTJGkDJg7wJD8J/ALwawBV9RTw1DhlSZLWM80e+EXAE8BfJ3kVcAC4qap+uLJTkiVgCWB+fn6K1Z2a03Fu7C7zio/pVL8Pp+P3TTpdTXMMfBfwauBDVXUZ8ENgz+pOVbW3qharanFubm6K1UmSVpomwI8CR6vq/mH5TpYDXZK0BSYO8Kr6d+BbSS4Zml4PfHWUqiRJ65r2LJTfBm4fzkD5BvDr05ckSdqIqQK8qg4CiyPVIkk6BV6JKUlNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNGeCS1JQBLklNTR3gSc5I8sUkd49RkCRpY8bYA78JODzC60iSTsFUAZ7kAuBq4K/GKUeStFG7pvz6DwLvAc46WYckS8ASwPz8/JSr25kW9tyzqf13spNtiyO3XL2prz/mOqRJTbwHnuQa4FhVHVirX1XtrarFqlqcm5ubdHWSpFWmOYRyJfCWJEeAjwJXJfmbUaqSJK1r4gCvqpur6oKqWgDeCny6qt4+WmWSpDV5HrgkNTXth5gAVNVngc+O8VqSpI1xD1ySmjLAJakpA1ySmjLAJakpA1ySmjLAJakpA1ySmjLAJakpA1ySmjLAJampUS6l3ymcl1tSJ+6BS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNWWAS1JTBrgkNTVxgCd5aZLPJDmc5MEkN41ZmCRpbdPMRvg08O6qeiDJWcCBJPuq6qsj1SZJWsPEe+BV9XhVPTA8/g/gMHD+WIVJktY2ynzgSRaAy4D7T/DcErAEMD8/P8bqdBrqMjf6yeo8csvVW1yJNL2pP8RM8gLgY8C7quoHq5+vqr1VtVhVi3Nzc9OuTpI0mCrAkzyL5fC+vao+Pk5JkqSNmOYslAC3Aoer6v3jlSRJ2ohp9sCvBN4BXJXk4HB780h1SZLWMfGHmFX1j0BGrEWSdAq8ElOSmjLAJakpA1ySmjLAJakpA1ySmjLAJakpA1ySmjLAJakpA1ySmjLAJampUeYD3wpd5pvW6eFU3y9jvr9Odc7xsdbdaU7z021e9rHqWet7uRljcw9ckpoywCWpKQNckpoywCWpKQNckpoywCWpKQNckpoywCWpKQNckpoywCWpKQNckpoywCWpqakCPMnuJA8neTTJnrGKkiStb+IAT3IG8OfAm4CXA9cneflYhUmS1jbNHvhrgEer6htV9RTwUeDaccqSJK0nVTXZFybXAbur6jeH5XcAr62qd67qtwQsDYuXAA+veqlzgO9MVMTOMMvjn+Wxg+Of5fGf6th/pqrmVjdO8w8dcoK2H/ttUFV7gb0nfZFkf1UtTlFHa7M8/lkeOzj+WR7/WGOf5hDKUeClK5YvAB6brhxJ0kZNE+D/DFyc5MIkzwbeCtw1TlmSpPVMfAilqp5O8k7gU8AZwG1V9eAEL3XSwyszYpbHP8tjB8c/y+MfZewTf4gpSdpeXokpSU0Z4JLU1LYF+Cxchp/ktiTHkhxa0faiJPuSPDLcv3BoT5I/HbbHl5O8evsqH0eSlyb5TJLDSR5MctPQvuO3QZLnJvlCki8NY/+Dof3CJPcPY//b4QQAkjxnWH50eH5hO+sfS5Izknwxyd3D8syMP8mRJF9JcjDJ/qFt1Pf+tgT4DF2G/2Fg96q2PcB9VXUxcN+wDMvb4uLhtgR8aItq3ExPA++uqpcBlwO/NXyfZ2Eb/DdwVVW9CrgU2J3kcuAPgQ8MY/8ucOPQ/0bgu1X1s8AHhn47wU3A4RXLszb+X6qqS1ec8z3ue7+qtvwGXAF8asXyzcDN21HLFox1ATi0Yvlh4Lzh8XnAw8PjvwSuP1G/nXIDPgm8cda2AXAm8ADwWpavvts1tD/zc8Dy2VxXDI93Df2y3bVPOe4LhpC6Crib5Yv/Zmn8R4BzVrWN+t7frkMo5wPfWrF8dGibBS+pqscBhvtzh/YdvU2GP4kvA+5nRrbBcPjgIHAM2Ad8HfheVT09dFk5vmfGPjz/feDFW1vx6D4IvAf432H5xczW+Au4N8mBYUoRGPm9P82l9NPY0GX4M2bHbpMkLwA+Bryrqn6QnGioy11P0NZ2G1TV/wCXJjkb+ATwshN1G+531NiTXAMcq6oDSV53vPkEXXfk+AdXVtVjSc4F9iV5aI2+E41/u/bAZ/ky/G8nOQ9guD82tO/IbZLkWSyH9+1V9fGheaa2QVV9D/gsy58DnJ3k+I7TyvE9M/bh+Z8CntzaSkd1JfCWJEdYnqn0Kpb3yGdl/FTVY8P9MZZ/gb+Gkd/72xXgs3wZ/l3ADcPjG1g+Lny8/VeHT6MvB75//E+trrK8q30rcLiq3r/iqR2/DZLMDXveJHke8AaWP8z7DHDd0G312I9vk+uAT9dwMLSjqrq5qi6oqgWWf74/XVVvY0bGn+T5Sc46/hj4ZeAQY7/3t/EA/5uBr7F8XPC92/2BwyaN8Q7gceBHLP+GvZHl43r3AY8M9y8a+oblM3O+DnwFWNzu+kcY/8+z/Gfgl4GDw+3Ns7ANgFcCXxzGfgj4/aH9IuALwKPA3wHPGdqfOyw/Ojx/0XaPYcRt8Trg7lka/zDOLw23B49n3NjvfS+ll6SmvBJTkpoywCWpKQNckpoywCWpKQNckpoywCWpKQNckpr6PxsSaXwiGw21AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.hist(df['word_count'], bins=50)\n",
    "plt.show()  # 绝大多数是250内的，不会是太长的"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "metadata": {},
   "outputs": [],
   "source": [
    "def clean_txt(text):\n",
    "    sub_replace = re.compile('[^0-9a-z]')  # 去掉非数值及英文的\n",
    "    text = sub_replace.sub(' ', text)\n",
    "    return text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 153,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>address</th>\n",
       "      <th>desc</th>\n",
       "      <th>word_count</th>\n",
       "      <th>desc_clean</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Hilton Garden Seattle Downtown</td>\n",
       "      <td>1821 Boren Avenue, Seattle Washington 98101 USA</td>\n",
       "      <td>Located on the southern tip of Lake Union, the...</td>\n",
       "      <td>184</td>\n",
       "      <td>located southern tip lake union  hilton garden...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sheraton Grand Seattle</td>\n",
       "      <td>1400 6th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Located in the city's vibrant core, the Sherat...</td>\n",
       "      <td>152</td>\n",
       "      <td>located city vibrant core  sheraton grand seat...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Crowne Plaza Seattle Downtown</td>\n",
       "      <td>1113 6th Ave, Seattle, WA 98101</td>\n",
       "      <td>Located in the heart of downtown Seattle, the ...</td>\n",
       "      <td>147</td>\n",
       "      <td>located heart downtown seattle  award winning ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Kimpton Hotel Monaco Seattle</td>\n",
       "      <td>1101 4th Ave, Seattle, WA98101</td>\n",
       "      <td>What?s near our hotel downtown Seattle locatio...</td>\n",
       "      <td>150</td>\n",
       "      <td>near hotel downtown seattle location  better  ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>The Westin Seattle</td>\n",
       "      <td>1900 5th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Situated amid incredible shopping and iconic a...</td>\n",
       "      <td>151</td>\n",
       "      <td>situated amid incredible shopping iconic attra...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                             name  \\\n",
       "0  Hilton Garden Seattle Downtown   \n",
       "1          Sheraton Grand Seattle   \n",
       "2   Crowne Plaza Seattle Downtown   \n",
       "3   Kimpton Hotel Monaco Seattle    \n",
       "4              The Westin Seattle   \n",
       "\n",
       "                                           address  \\\n",
       "0  1821 Boren Avenue, Seattle Washington 98101 USA   \n",
       "1   1400 6th Avenue, Seattle, Washington 98101 USA   \n",
       "2                  1113 6th Ave, Seattle, WA 98101   \n",
       "3                   1101 4th Ave, Seattle, WA98101   \n",
       "4   1900 5th Avenue, Seattle, Washington 98101 USA   \n",
       "\n",
       "                                                desc  word_count  \\\n",
       "0  Located on the southern tip of Lake Union, the...         184   \n",
       "1  Located in the city's vibrant core, the Sherat...         152   \n",
       "2  Located in the heart of downtown Seattle, the ...         147   \n",
       "3  What?s near our hotel downtown Seattle locatio...         150   \n",
       "4  Situated amid incredible shopping and iconic a...         151   \n",
       "\n",
       "                                          desc_clean  \n",
       "0  located southern tip lake union  hilton garden...  \n",
       "1  located city vibrant core  sheraton grand seat...  \n",
       "2  located heart downtown seattle  award winning ...  \n",
       "3  near hotel downtown seattle location  better  ...  \n",
       "4  situated amid incredible shopping iconic attra...  "
      ]
     },
     "execution_count": 153,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 过滤掉不需要保留的\n",
    "from nltk.corpus import stopwords\n",
    "set_stopwords = set(stopwords.words('english'))\n",
    "\n",
    "df['desc_clean'] = df['desc'].str.lower()  # 全部转小写\n",
    "df['desc_clean'] = df['desc_clean'].apply(clean_txt)\n",
    "df['desc_clean'] = df['desc_clean'].str.split(' ').apply(lambda x: ' '.join(k for k in x if k not in set_stopwords))  # 去掉停用词\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 154,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Located on the southern tip of Lake Union, the Hilton Garden Inn Seattle Downtown hotel is perfectly located for business and leisure. \\nThe neighborhood is home to numerous major international companies including Amazon, Google and the Bill & Melinda Gates Foundation. A wealth of eclectic restaurants and bars make this area of Seattle one of the most sought out by locals and visitors. Our proximity to Lake Union allows visitors to take in some of the Pacific Northwest's majestic scenery and enjoy outdoor activities like kayaking and sailing. over 2,000 sq. ft. of versatile space and a complimentary business center. State-of-the-art A/V technology and our helpful staff will guarantee your conference, cocktail reception or wedding is a success. Refresh in the sparkling saltwater pool, or energize with the latest equipment in the 24-hour fitness center. Tastefully decorated and flooded with natural light, our guest rooms and suites offer everything you need to relax and stay productive. Unwind in the bar, and enjoy American cuisine for breakfast, lunch and dinner in our restaurant. The 24-hour Pavilion Pantry? stocks a variety of snacks, drinks and sundries.\""
      ]
     },
     "execution_count": 154,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['desc'][0]  # 比较两者的差异"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 155,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'located southern tip lake union  hilton garden inn seattle downtown hotel perfectly located business leisure   neighborhood home numerous major international companies including amazon  google bill   melinda gates foundation  wealth eclectic restaurants bars make area seattle one sought locals visitors  proximity lake union allows visitors take pacific northwest majestic scenery enjoy outdoor activities like kayaking sailing  2 000 sq  ft  versatile space complimentary business center  state art v technology helpful staff guarantee conference  cocktail reception wedding success  refresh sparkling saltwater pool  energize latest equipment 24 hour fitness center  tastefully decorated flooded natural light  guest rooms suites offer everything need relax stay productive  unwind bar  enjoy american cuisine breakfast  lunch dinner restaurant  24 hour pavilion pantry  stocks variety snacks  drinks sundries '"
      ]
     },
     "execution_count": 155,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['desc_clean'][0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 相似度计算"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 156,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.set_index('name', inplace=True)  # 把name变成索引"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 158,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算每个词的权重水平\n",
    "tf = TfidfVectorizer(analyzer='word', ngram_range=(1,3), stop_words='english')\n",
    "tfidf_matrix = tf.fit_transform(df['desc_clean'])  # 转换当前数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 161,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(152, 26631)"
      ]
     },
     "execution_count": 161,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tfidf_matrix.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 165,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(152, 152)"
      ]
     },
     "execution_count": 165,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "consine_similarity = linear_kernel(tfidf_matrix, tfidf_matrix)  # 计算相似度\n",
    "consine_similarity.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 166,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.        , 0.01406466, 0.03391973, 0.00993816, 0.03246863,\n",
       "       0.01501356, 0.02084233, 0.01581231, 0.00776991, 0.01999756,\n",
       "       0.0182464 , 0.01231142, 0.01684817, 0.0119307 , 0.01085672,\n",
       "       0.01791009, 0.0111671 , 0.04070581, 0.00971403, 0.02608081,\n",
       "       0.03035044, 0.00885341, 0.01056546, 0.02009413, 0.01868132,\n",
       "       0.02816165, 0.0321467 , 0.00681797, 0.02538754, 0.01969646,\n",
       "       0.01638717, 0.04434173, 0.0167791 , 0.02169556, 0.03728075,\n",
       "       0.03902235, 0.0069193 , 0.01352541, 0.04098731, 0.03227337,\n",
       "       0.0172481 , 0.01166389, 0.01520804, 0.03544255, 0.04699436,\n",
       "       0.01310661, 0.03274589, 0.0161937 , 0.03786155, 0.01421505,\n",
       "       0.0266454 , 0.01830098, 0.03764235, 0.01329187, 0.02744756,\n",
       "       0.01454037, 0.02460386, 0.03082779, 0.01229374, 0.02683908,\n",
       "       0.03151467, 0.01008901, 0.04523004, 0.0312478 , 0.0323932 ,\n",
       "       0.01846074, 0.03120115, 0.01118123, 0.02208553, 0.01201834,\n",
       "       0.02355357, 0.01679123, 0.02597236, 0.02219805, 0.02335901,\n",
       "       0.04484254, 0.00131829, 0.02258004, 0.01596417, 0.02875198,\n",
       "       0.00728455, 0.01550146, 0.00586358, 0.00886017, 0.01505134,\n",
       "       0.04805398, 0.01154452, 0.00439089, 0.00890586, 0.01341109,\n",
       "       0.00761107, 0.00443603, 0.0146058 , 0.00493675, 0.01795282,\n",
       "       0.01702045, 0.01116872, 0.02318485, 0.01508132, 0.02823554,\n",
       "       0.01212307, 0.00548954, 0.00335406, 0.02440467, 0.00912747,\n",
       "       0.02412254, 0.04179826, 0.02109056, 0.01228275, 0.03570519,\n",
       "       0.05331295, 0.00886831, 0.0258668 , 0.01566466, 0.0267365 ,\n",
       "       0.07529637, 0.01660016, 0.0371029 , 0.0114389 , 0.01876546,\n",
       "       0.00671789, 0.01194306, 0.01871489, 0.00346884, 0.00876216,\n",
       "       0.00946862, 0.04517183, 0.07370297, 0.00884079, 0.01411685,\n",
       "       0.01406232, 0.0124469 , 0.02123197, 0.01859324, 0.02939583,\n",
       "       0.00481356, 0.0358775 , 0.01307147, 0.0136874 , 0.01567845,\n",
       "       0.01888209, 0.02270796, 0.02684905, 0.01715449, 0.00317041,\n",
       "       0.00237712, 0.0237994 , 0.00739057, 0.00643772, 0.01595671,\n",
       "       0.00239758, 0.00730286])"
      ]
     },
     "execution_count": 166,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "consine_similarity[0]  # 第0个与全部矩阵内容的相似度计算"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 得出推荐结果"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 167,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    Hilton Garden Seattle Downtown\n",
       "1            Sheraton Grand Seattle\n",
       "2     Crowne Plaza Seattle Downtown\n",
       "3     Kimpton Hotel Monaco Seattle \n",
       "4                The Westin Seattle\n",
       "Name: name, dtype: object"
      ]
     },
     "execution_count": 167,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "indices = pd.Series(df.index)  # 单独拿出做索引\n",
    "indices[:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 172,
   "metadata": {},
   "outputs": [],
   "source": [
    "def recommendations(name, consine_similarity):\n",
    "    \"\"\"\n",
    "    推荐酒店\n",
    "    name: 用户浏览的酒店\n",
    "    consine_similarity: 酒店相似度信息\n",
    "    \"\"\"\n",
    "    recommended_hotels = []  # 推荐列表\n",
    "    idx = indices[indices == name].index[0]  # 获取当前用户浏览的酒店\n",
    "    score_series = pd.Series(consine_similarity[idx]).sort_values(ascending=False)  # 找到对应的酒店并降序\n",
    "    top_10_indexes = list(score_series[1:11].index)  # 获取前10个索引，第一个是自己，需要排除掉\n",
    "    for i in top_10_indexes:\n",
    "        recommended_hotels.append(list(df.index)[i])\n",
    "    \n",
    "    return recommended_hotels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 169,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>address</th>\n",
       "      <th>desc</th>\n",
       "      <th>word_count</th>\n",
       "      <th>desc_clean</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>name</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Hilton Garden Seattle Downtown</th>\n",
       "      <td>1821 Boren Avenue, Seattle Washington 98101 USA</td>\n",
       "      <td>Located on the southern tip of Lake Union, the...</td>\n",
       "      <td>184</td>\n",
       "      <td>located southern tip lake union  hilton garden...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Sheraton Grand Seattle</th>\n",
       "      <td>1400 6th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Located in the city's vibrant core, the Sherat...</td>\n",
       "      <td>152</td>\n",
       "      <td>located city vibrant core  sheraton grand seat...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Crowne Plaza Seattle Downtown</th>\n",
       "      <td>1113 6th Ave, Seattle, WA 98101</td>\n",
       "      <td>Located in the heart of downtown Seattle, the ...</td>\n",
       "      <td>147</td>\n",
       "      <td>located heart downtown seattle  award winning ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kimpton Hotel Monaco Seattle</th>\n",
       "      <td>1101 4th Ave, Seattle, WA98101</td>\n",
       "      <td>What?s near our hotel downtown Seattle locatio...</td>\n",
       "      <td>150</td>\n",
       "      <td>near hotel downtown seattle location  better  ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>The Westin Seattle</th>\n",
       "      <td>1900 5th Avenue, Seattle, Washington 98101 USA</td>\n",
       "      <td>Situated amid incredible shopping and iconic a...</td>\n",
       "      <td>151</td>\n",
       "      <td>situated amid incredible shopping iconic attra...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                        address  \\\n",
       "name                                                                              \n",
       "Hilton Garden Seattle Downtown  1821 Boren Avenue, Seattle Washington 98101 USA   \n",
       "Sheraton Grand Seattle           1400 6th Avenue, Seattle, Washington 98101 USA   \n",
       "Crowne Plaza Seattle Downtown                   1113 6th Ave, Seattle, WA 98101   \n",
       "Kimpton Hotel Monaco Seattle                     1101 4th Ave, Seattle, WA98101   \n",
       "The Westin Seattle               1900 5th Avenue, Seattle, Washington 98101 USA   \n",
       "\n",
       "                                                                             desc  \\\n",
       "name                                                                                \n",
       "Hilton Garden Seattle Downtown  Located on the southern tip of Lake Union, the...   \n",
       "Sheraton Grand Seattle          Located in the city's vibrant core, the Sherat...   \n",
       "Crowne Plaza Seattle Downtown   Located in the heart of downtown Seattle, the ...   \n",
       "Kimpton Hotel Monaco Seattle    What?s near our hotel downtown Seattle locatio...   \n",
       "The Westin Seattle              Situated amid incredible shopping and iconic a...   \n",
       "\n",
       "                                word_count  \\\n",
       "name                                         \n",
       "Hilton Garden Seattle Downtown         184   \n",
       "Sheraton Grand Seattle                 152   \n",
       "Crowne Plaza Seattle Downtown          147   \n",
       "Kimpton Hotel Monaco Seattle           150   \n",
       "The Westin Seattle                     151   \n",
       "\n",
       "                                                                       desc_clean  \n",
       "name                                                                               \n",
       "Hilton Garden Seattle Downtown  located southern tip lake union  hilton garden...  \n",
       "Sheraton Grand Seattle          located city vibrant core  sheraton grand seat...  \n",
       "Crowne Plaza Seattle Downtown   located heart downtown seattle  award winning ...  \n",
       "Kimpton Hotel Monaco Seattle    near hotel downtown seattle location  better  ...  \n",
       "The Westin Seattle              situated amid incredible shopping iconic attra...  "
      ]
     },
     "execution_count": 169,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()  # 看下大概信息，拿一个出来试试结果"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 173,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Silver Cloud Inn - Seattle Lake Union',\n",
       " 'Staybridge Suites Seattle Downtown - Lake Union',\n",
       " 'Residence Inn by Marriott Seattle Downtown/Lake Union',\n",
       " 'The Loyal Inn',\n",
       " 'The Arctic Club Seattle - a DoubleTree by Hilton Hotel',\n",
       " 'Embassy Suites by Hilton Seattle Tacoma International Airport',\n",
       " 'The Charter Hotel Seattle, Curio Collection by Hilton',\n",
       " 'MarQueen Hotel',\n",
       " 'Residence Inn by Marriott Seattle Downtown/Convention Center',\n",
       " 'EVEN Hotel Seattle - South Lake Union']"
      ]
     },
     "execution_count": 173,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "recommendations('Hilton Garden Seattle Downtown', consine_similarity)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 202,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Located on the southern tip of Lake Union, the Hilton Garden Inn Seattle Downtown hotel is perfectly located for business and leisure. \\nThe neighborhood is home to numerous major international companies including Amazon, Google and the Bill & Melinda Gates Foundation. A wealth of eclectic restaurants and bars make this area of Seattle one of the most sought out by locals and visitors. Our proximity to Lake Union allows visitors to take in some of the Pacific Northwest's majestic scenery and enjoy outdoor activities like kayaking and sailing. over 2,000 sq. ft. of versatile space and a complimentary business center. State-of-the-art A/V technology and our helpful staff will guarantee your conference, cocktail reception or wedding is a success. Refresh in the sparkling saltwater pool, or energize with the latest equipment in the 24-hour fitness center. Tastefully decorated and flooded with natural light, our guest rooms and suites offer everything you need to relax and stay productive. Unwind in the bar, and enjoy American cuisine for breakfast, lunch and dinner in our restaurant. The 24-hour Pavilion Pantry? stocks a variety of snacks, drinks and sundries.\""
      ]
     },
     "execution_count": 202,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 我们拿最接近的两个来看下其中的描述\n",
    "df.loc[['Hilton Garden Seattle Downtown'],[\"desc\"]]['desc'][0]  # 浏览的酒店"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 203,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Located on the south end of scenic Lake Union, the Silver Cloud Hotel Seattle \\x96 Lake Union is two miles from downtown Seattle\\x92s finest shops and restaurants and just five miles from the University of Washington campus. The hotel is also located near Seattle Center and the Fred Hutchinson Cancer Research Center, and is only a 20 minute walk to the\\xa0iconic Space Needle. Most of the guestrooms have spectacular waterfront views of Lake Union and Seattle. All rooms have 55? high-definition flat-screen TVs complimentary high-speed wired and wireless Internet access, microwaves, refrigerators and laptop safes. A complimentary Silver Cloud breakfast is offered daily in addition to use of our expansive fitness center. The Silver Cloud Hotel \\x96 Lake Union features complimentary parking and a local area shuttle service. Located along the south shore of Lake Union, the Silver Cloud Inn is nestled in one of the fastest growing and most vibrant neighborhoods in the Seattle area. Offering a blend of the Pacific Northwest outdoors and modern urban living, Lake Union\\x92s distinctive neighborhood is home to several globally-recognized companies including Amazon, Fred Hutchinson Cancer Research Center and University of Washington Medicine. Whether you are traveling by streetcar or seaplane, visit Lake Union for a uniquely Seattle experience. Explore all the nearby attractions below.'"
      ]
     },
     "execution_count": 203,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Silver Cloud Inn - Seattle Lake Union'],[\"desc\"]]['desc'][0]  # 推荐1的酒店"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 204,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"The Staybridge Suites Seattle- South Lake Union, opening in 2018, offers guests a home away from home in the heart of the Emerald City of Seattle. Our upscale residential style hotel is conveniently located in the South Lake Union neighborhood, steps away from beautiful Lake Union and a quick jaunt to the Downtown corridor. Our all-suite hotel features full kitchens and your choice of studio or one bedroom suites. We can also accommodate your private business meetings or special events. Be sure to take advantage of our complimentary Monday through Wednesday evening social reception and free breakfast buffet every morning. Experience the many attractions that the South Lake Union area has to offer including the Space Needle, MOPOP, Pacific Science Center and The Museum of History & Industry. With our proximity to Lake Union, you'll have easy access to fun excursions such as boat rentals or taking in the Pacific Northwest by sea plane! South Lake Union is home to many businesses including Amazon, Facebook, Fred Hutchinson Cancer Research Center, Seattle Cancer Care Alliance and many more within walking distance of the hotel. All of our comfortable suites are newly renovated and beautifully appointed with their furnishings. Each suite includes a full kitchen with all the supporting utensils to prepare food, complimentary high speed internet, flat screen televisions with a variety of programming options and daily housekeeping.\""
      ]
     },
     "execution_count": 204,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Staybridge Suites Seattle Downtown - Lake Union'],[\"desc\"]]['desc'][0]  # 推荐2的酒店"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到三者都有关键词：On the Lake Union、The Pacific Northwest's等"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 总结：\n",
    "1. 先分析高频词，对高频词进行纠正。\n",
    "2. 这里用到的技术有stopwords(停用词)，对于一些非重要信息可以采用剔除的操作，以及re.compile对非信息的数据进行剔除。\n",
    "2. 利用TfidfVectorizer计算每句话中词的权重，并使用linear_kernel进行相似度计算。\n",
    "3. 最后在利用用户浏览的酒店的相关信息进行推荐。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
